[PATCH] coredump: Retry writes where appropriate

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH] coredump: Retry writes where appropriate
@ 2009-05-31  5:33 Paul Smith
  2009-05-31 10:18 ` Alan Cox
  0 siblings, 1 reply; 36+ messages in thread
From: Paul Smith @ 2009-05-31  5:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: stable, Andrew Morton, Andi Kleen, Oleg Nesterov, Roland McGrath

coredump: Retry writes where appropriate

Core dump write operations (especially to a pipe) can be incomplete due
to signal reception or possibly recoverable partial writes.

Previously any incomplete write in the ELF core dumper caused the core
dump to stop, giving short cores in these cases.  Modify the core dumper
to retry the write where appropriate.

Signed-off-by: Paul Smith <paul@mad-scientist.net>
Cc: stable@kernel.org

---

 fs/binfmt_elf.c |   19 ++++++++++++++++++-
 1 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 40381df..26b03cc 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -1119,7 +1119,24 @@ out:
  */
 static int dump_write(struct file *file, const void *addr, int nr)
 {
-	return file->f_op->write(file, addr, nr, &file->f_pos) == nr;
+	const char *p = addr;
+	while (1) {
+		int r = file->f_op->write(file, p, nr, &file->f_pos);
+
+		if (likely(r == nr))
+			return 1;
+
+		if (r == -ERESTARTSYS || r == -EAGAIN || r == -EINTR)
+			/* Ignore signals during coredump. */
+			clear_thread_flag(TIF_SIGPENDING);
+		else if (r > 0) {
+			/* Partial write: try again with the rest. */
+			p += r;
+			nr -= r;
+		} else
+			/* Lose! */
+			return 0;
+	}
 }
 
 static int dump_seek(struct file *file, loff_t off)


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31  5:33 [PATCH] coredump: Retry writes where appropriate Paul Smith
@ 2009-05-31 10:18 ` Alan Cox
  2009-05-31 14:03   ` Olivier Galibert
  2009-06-01 16:12   ` Oleg Nesterov
  0 siblings, 2 replies; 36+ messages in thread
From: Alan Cox @ 2009-05-31 10:18 UTC (permalink / raw)
  To: paul
  Cc: linux-kernel, stable, Andrew Morton, Andi Kleen, Oleg Nesterov,
	Roland McGrath

On Sun, 31 May 2009 01:33:39 -0400
Paul Smith <paul@mad-scientist.net> wrote:

> coredump: Retry writes where appropriate
> 
> Core dump write operations (especially to a pipe) can be incomplete due
> to signal reception or possibly recoverable partial writes.

NAK this

> Previously any incomplete write in the ELF core dumper caused the core
> dump to stop, giving short cores in these cases.  Modify the core dumper
> to retry the write where appropriate.

The existing behaviour is an absolute godsend when you've something like
a core dump stuck on an NFS mount or something trying to core dump to
very slow media.

In fact the signals checks were *purposefully added* some time ago.

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31 10:18 ` Alan Cox
@ 2009-05-31 14:03   ` Olivier Galibert
  2009-05-31 16:31     ` Alan Cox
  2009-05-31 16:56     ` Paul Smith
  2009-06-01 16:12   ` Oleg Nesterov
  1 sibling, 2 replies; 36+ messages in thread
From: Olivier Galibert @ 2009-05-31 14:03 UTC (permalink / raw)
  To: Alan Cox
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Oleg Nesterov, Roland McGrath

On Sun, May 31, 2009 at 11:18:51AM +0100, Alan Cox wrote:
> On Sun, 31 May 2009 01:33:39 -0400
> Paul Smith <paul@mad-scientist.net> wrote:
> 
> > coredump: Retry writes where appropriate
> > 
> > Core dump write operations (especially to a pipe) can be incomplete due
> > to signal reception or possibly recoverable partial writes.
> 
> NAK this
> 
> > Previously any incomplete write in the ELF core dumper caused the core
> > dump to stop, giving short cores in these cases.  Modify the core dumper
> > to retry the write where appropriate.
> 
> The existing behaviour is an absolute godsend when you've something like
> a core dump stuck on an NFS mount or something trying to core dump to
> very slow media.
> 
> In fact the signals checks were *purposefully added* some time ago.

Perhaps removing the "|| r == -EINTR" part would make both of you
happy?  He gets the reliability on pipes, you keep the interrupt on
signals.

  OG.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31 14:03   ` Olivier Galibert
@ 2009-05-31 16:31     ` Alan Cox
  2009-05-31 16:49       ` Olivier Galibert
  2009-05-31 17:46       ` Paul Smith
  2009-05-31 16:56     ` Paul Smith
  1 sibling, 2 replies; 36+ messages in thread
From: Alan Cox @ 2009-05-31 16:31 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Oleg Nesterov, Roland McGrath

O> Perhaps removing the "|| r == -EINTR" part would make both of you
> happy?  He gets the reliability on pipes, you keep the interrupt on
> signals.

How does that improve things.

There is a second problem anyway. Suppose something is causing a
continual stream of signal events - what guarantees it makes progress ?

The only source of signals during a dump should be external ones. Far
better would be to set some kind of defined signal mask during the dump
(say SIGPIPE, SIGINT, SIGQUIT) ? I agree with Paul's patch in the sense
we don't want spurious SIGIO events or similar spoiling a dump.

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31 16:31     ` Alan Cox
@ 2009-05-31 16:49       ` Olivier Galibert
  2009-05-31 17:46       ` Paul Smith
  1 sibling, 0 replies; 36+ messages in thread
From: Olivier Galibert @ 2009-05-31 16:49 UTC (permalink / raw)
  To: Alan Cox
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Oleg Nesterov, Roland McGrath

On Sun, May 31, 2009 at 05:31:44PM +0100, Alan Cox wrote:
> O> Perhaps removing the "|| r == -EINTR" part would make both of you
> > happy?  He gets the reliability on pipes, you keep the interrupt on
> > signals.
> 
> How does that improve things.

I can very well misunderstand the code, but if you dump to a pipe,
what you can write() in one call is limited to the pipe buffer
capacity, isn't it?  Something like 16 pages iirc.  So you get a short
write and nothing there seems to call write again.

I've also had short writes on normal filesystems (nfs at least,
reiserfs and ext3 too I seem to remember, but it was a 2.6.20-era
kernel) for writes bigger than 2G.  So I'm not sure a dump of a 2G+
process would actually work.

> There is a second problem anyway. Suppose something is causing a
> continual stream of signal events - what guarantees it makes progress ?

Can a signal end up in anything else than EINTR?

> The only source of signals during a dump should be external ones. Far
> better would be to set some kind of defined signal mask during the dump
> (say SIGPIPE, SIGINT, SIGQUIT) ? I agree with Paul's patch in the sense
> we don't want spurious SIGIO events or similar spoiling a dump.

But Paul's patch is not just about signals, it's about EAGAIN and
short writes too.

  OG.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31 16:31     ` Alan Cox
  2009-05-31 16:49       ` Olivier Galibert
@ 2009-05-31 17:46       ` Paul Smith
  1 sibling, 0 replies; 36+ messages in thread
From: Paul Smith @ 2009-05-31 17:46 UTC (permalink / raw)
  To: Alan Cox
  Cc: Olivier Galibert, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Oleg Nesterov, Roland McGrath

On Sun, 2009-05-31 at 17:31 +0100, Alan Cox wrote:
> The only source of signals during a dump should be external ones. Far
> better would be to set some kind of defined signal mask during the dump
> (say SIGPIPE, SIGINT, SIGQUIT) ? I agree with Paul's patch in the sense
> we don't want spurious SIGIO events or similar spoiling a dump.

Something similar to this is what Andi Kleen first proposed.  The issue
is that it modifies the signal mask for the process before the core was
dumped, so that the mask saved in the core was not the real mask used by
the process when it took the exception.

Andi mentioned this problem and suggested we'd need to keep the original
mask around to be saved in the core.  That would involve deeper hacking
in the ELF format but maybe this is the right answer.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31 14:03   ` Olivier Galibert
  2009-05-31 16:31     ` Alan Cox
@ 2009-05-31 16:56     ` Paul Smith
  1 sibling, 0 replies; 36+ messages in thread
From: Paul Smith @ 2009-05-31 16:56 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Alan Cox, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Oleg Nesterov, Roland McGrath

On Sun, 2009-05-31 at 16:03 +0200, Olivier Galibert wrote:
> On Sun, May 31, 2009 at 11:18:51AM +0100, Alan Cox wrote:
> > On Sun, 31 May 2009 01:33:39 -0400
> > Paul Smith <paul@mad-scientist.net> wrote:
> > 
> > > coredump: Retry writes where appropriate
> > > 
> > > Core dump write operations (especially to a pipe) can be incomplete due
> > > to signal reception or possibly recoverable partial writes.
> > 
> > NAK this
> > 
> > > Previously any incomplete write in the ELF core dumper caused the core
> > > dump to stop, giving short cores in these cases.  Modify the core dumper
> > > to retry the write where appropriate.
> > 
> > The existing behaviour is an absolute godsend when you've something like
> > a core dump stuck on an NFS mount or something trying to core dump to
> > very slow media.
> > 
> > In fact the signals checks were *purposefully added* some time ago.

This is what Olivier mentioned as well, and I do see the benefit in
being able to get rid of hung up coredumps.  But to me it's more
important to have reliable and robust coredumping, and I'm getting
reports of short cores on my systems at least once a week due to this
problem (the userspace applications I'm working with use signals for
certain well-defined situations, that tend to happen at around the same
time as you might expect core dumps).

> Perhaps removing the "|| r == -EINTR" part would make both of you
> happy?  He gets the reliability on pipes, you keep the interrupt on
> signals.

I'm getting back ERESTARTSYS in my environment, and it's happening
because pipe_write() detects a signal pending.  I don't think this is
due to SIGPIPE, and I'm not sure that removing EINTR will give Alan the
behavior he is looking for.

Another possibility would be to examine the signal itself and don't
retry if it's SIGKILL.  I'm too much of a kernel hacking noob to know
offhand how to find the pending signal but I can certainly figure it
out.  If it's possible, Alan, would that be an acceptable alternative?

I'm not entirely happy with this because, as was discussed in an earlier
thread, there are plenty of common idioms where you can expect to
receive a SIGKILL while you're in the middle of dumping core (lots of
userspace setups will send a few HUPs, followed by a few INTs, and if
the process is still there they send KILLs).  However, for my specific
purposes this would be sufficient.

Other ideas?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-05-31 10:18 ` Alan Cox
  2009-05-31 14:03   ` Olivier Galibert
@ 2009-06-01 16:12   ` Oleg Nesterov
  2009-06-01 16:41     ` Alan Cox
  2009-06-01 17:36     ` Paul Smith
  1 sibling, 2 replies; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-01 16:12 UTC (permalink / raw)
  To: Alan Cox
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On 05/31, Alan Cox wrote:
>
> On Sun, 31 May 2009 01:33:39 -0400
> Paul Smith <paul@mad-scientist.net> wrote:
>
> > coredump: Retry writes where appropriate
> >
> > Core dump write operations (especially to a pipe) can be incomplete due
> > to signal reception or possibly recoverable partial writes.
>
> NAK this
>
> > Previously any incomplete write in the ELF core dumper caused the core
> > dump to stop, giving short cores in these cases.  Modify the core dumper
> > to retry the write where appropriate.
>
> The existing behaviour is an absolute godsend when you've something like
> a core dump stuck on an NFS mount or something trying to core dump to
> very slow media.

I agree, we should make the coredumping interruptible.

But I don't know which signal should intterrupt. At least SIGKILL should,
I think. As for other unhandled sig_fatal() signals, I am nor sure.
I can make a patch, but first I need to know what this patch should do.
Again, please look at:

	killable/interruptible coredumps
	http://marc.info/?l=linux-kernel&m=121665710711931


And we should not change dump_write(), we should create the new helper
which can be used by all fs/binfmt_*.c.

And of course, the coredumping thread should not play with ->blocked
or ->sighand->action. This is not even needed.

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 16:12   ` Oleg Nesterov
@ 2009-06-01 16:41     ` Alan Cox
  2009-06-01 17:11       ` Oleg Nesterov
  2009-06-01 17:36     ` Paul Smith
  1 sibling, 1 reply; 36+ messages in thread
From: Alan Cox @ 2009-06-01 16:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

> I think. As for other unhandled sig_fatal() signals, I am nor sure.
> I can make a patch, but first I need to know what this patch should do.
> Again, please look at:

If you are on the command line then SIGINT/SIGQUIT would be the obvious
ones for this ?


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 16:41     ` Alan Cox
@ 2009-06-01 17:11       ` Oleg Nesterov
  2009-06-01 17:46         ` Alan Cox
  0 siblings, 1 reply; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-01 17:11 UTC (permalink / raw)
  To: Alan Cox
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On 06/01, Alan Cox wrote:
>
> > I think. As for other unhandled sig_fatal() signals, I am nor sure.
> > I can make a patch, but first I need to know what this patch should do.
> > Again, please look at:
>
> If you are on the command line then SIGINT/SIGQUIT would be the obvious
> ones for this ?

Not sure I understand. Do you mean we should treat the tty signals
specially ?

Personally, I don't think we should. If we decide that only SIGKILL
interrupts the coredumping, then I think ^C should not interrupt.

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 17:11       ` Oleg Nesterov
@ 2009-06-01 17:46         ` Alan Cox
  2009-06-01 18:23           ` Oleg Nesterov
  0 siblings, 1 reply; 36+ messages in thread
From: Alan Cox @ 2009-06-01 17:46 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

> > If you are on the command line then SIGINT/SIGQUIT would be the obvious
> > ones for this ?
> 
> Not sure I understand. Do you mean we should treat the tty signals
> specially ?

Yes

> Personally, I don't think we should. If we decide that only SIGKILL
> interrupts the coredumping, then I think ^C should not interrupt.

Wait until you have a remote session over ssh that core dumps a 2GB core.
Then you'll understand why being able to ^C or ^\ it is useful.

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 17:46         ` Alan Cox
@ 2009-06-01 18:23           ` Oleg Nesterov
  2009-06-01 20:38             ` Roland McGrath
  0 siblings, 1 reply; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-01 18:23 UTC (permalink / raw)
  To: Alan Cox
  Cc: paul, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On 06/01, Alan Cox wrote:
>
> > > If you are on the command line then SIGINT/SIGQUIT would be the obvious
> > > ones for this ?
> >
> > Not sure I understand. Do you mean we should treat the tty signals
> > specially ?
>
> Yes
>
> > Personally, I don't think we should. If we decide that only SIGKILL
> > interrupts the coredumping, then I think ^C should not interrupt.
>
> Wait until you have a remote session over ssh that core dumps a 2GB core.
> Then you'll understand why being able to ^C or ^\ it is useful.

Sure. But you have the same problem with

	$ perl -e '$SIG{INT} = $SIG{QUIT} = IGNORE; sleep'
	^C^C^C^\^\^\

over ssh.

And what if the coredumping task already has the pending SIGINT/SIGQUIT
or blocks/ignores them?

But don't get me wrong, I do agree this is useful, and this can be
implemented. But in this case, imho kill(SIGINT) should work as well,
not just ^C.

In short, I agree in advance with any authoritative decision. But
if we add more power to ^C (compared to kill), this should be a
separate patch imho.

Oleg.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 18:23           ` Oleg Nesterov
@ 2009-06-01 20:38             ` Roland McGrath
  2009-06-01 22:32               ` Oleg Nesterov
  2009-06-03 14:05               ` Paul Smith
  0 siblings, 2 replies; 36+ messages in thread
From: Roland McGrath @ 2009-06-01 20:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

IMHO it would certainly be wrong to have behavior differ based on what
particular method from userland was used to generate a signal.  I don't
think Alan was suggesting anything like that, just using ^C as shorthand
for what the user experience is that matters.  (Alan also mentioned ^\,
i.e. SIGQUIT.  I don't think it would make sense to have that interrupt
dumping--it's what you hit to request a core dump.)

I'm not really sure exactly how this has morphed over time, or how the
internal wrinkles were before that produced the observed behavior.  I
suspect it was always more by happenstance than careful attention.

Any sig_fatal() non-core signal that was sent before the core dumping
actually started already prevents the dump now.  The complete_signal()
code along with the signal_group_exit() check in zap_threads() should
ensure that.  This is as it should be: there was no guarantee of the
order in which the signals were processed, so the effect is that the
core dump signal was swallowed by the arrival of the just-fatal signal.

Aside from that, I see the following categories for newly-arriving signals.

1. More core-dump signals.  e.g., it was already crashing and you hit ^\
   or maybe just hit ^\ twice with a finger delay.  
2. Non-fatal signals (i.e. ones with handlers, stop signals).
3. Plain sig_fatal() non-core signals (e.g. SIGINT when not handled)
4. SIGKILL (an actual one from userland or oomkill, not group-exit)

#1 IMHO should not do anything at all.  
You are asking for a core dump, it's already doing it.

#2 should not do anything at all.
It's not really possible to suspend during the core dump, so unhandled,
unblocked stop signals can't do anything either.

#4 IMHO should always stop everything immediately.
That's what SIGKILL is for.  When userland generates a SIGKILL
explicitly, that says the top priority is to be gone and cease
consuming any resources ASAP.

#3 is the open question.  I don't feel strongly either way.

Whatever the decision on #3, we have a problem to fix for #1 and #2 at
least anyway.  These unblocked signals will cause TIF_SIGPENDING to be
set when dumping, either via recalc_sigpending() from dequeue_signal()
after the core signal is taken (more signals already pending), or via
signal_wake_up() from complete_signal() for newly-generated signals.
(PF_EXITING is not yet set to prevent it.)  This spuriously prevents
interruptible waits in the fs code or the pipe code or whatnot.

That looks simple to avoid just by clobbering ->real_blocked when we
start the core dump.  The dump-writing itself uses ->blocked, so that
actually won't pollute the data.  The less magical way that is obvious
would be to add a SIGNAL_GROUP_DUMPING flag that we set at the beginning
of the dumping, and make recalc_sigpending_tsk/complete_signal obey that.

Thanks,
Roland

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 20:38             ` Roland McGrath
@ 2009-06-01 22:32               ` Oleg Nesterov
  2009-06-01 23:02                 ` Roland McGrath
  2009-06-02  8:21                 ` Alan Cox
  2009-06-03 14:05               ` Paul Smith
  1 sibling, 2 replies; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-01 22:32 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

On 06/01, Roland McGrath wrote:
>
> IMHO it would certainly be wrong to have behavior differ based on what
> particular method from userland was used to generate a signal.

Agreed.

> Aside from that, I see the following categories for newly-arriving signals.
>
> 1. More core-dump signals.  e.g., it was already crashing and you hit ^\
>    or maybe just hit ^\ twice with a finger delay.
> 2. Non-fatal signals (i.e. ones with handlers, stop signals).
> 3. Plain sig_fatal() non-core signals (e.g. SIGINT when not handled)
> 4. SIGKILL (an actual one from userland or oomkill, not group-exit)
>
> #1 IMHO should not do anything at all.
> You are asking for a core dump, it's already doing it.
>
> #2 should not do anything at all.
> It's not really possible to suspend during the core dump, so unhandled,
> unblocked stop signals can't do anything either.
>
> #4 IMHO should always stop everything immediately.
> That's what SIGKILL is for.  When userland generates a SIGKILL
> explicitly, that says the top priority is to be gone and cease
> consuming any resources ASAP.

Agreed.

> #3 is the open question.  I don't feel strongly either way.
>
> Whatever the decision on #3, we have a problem to fix for #1 and #2 at
> least anyway.  These unblocked signals will cause TIF_SIGPENDING to be
> set when dumping, either via recalc_sigpending() from dequeue_signal()
> after the core signal is taken (more signals already pending), or via
> signal_wake_up() from complete_signal() for newly-generated signals.
> (PF_EXITING is not yet set to prevent it.)  This spuriously prevents
> interruptible waits in the fs code or the pipe code or whatnot.
>
> That looks simple to avoid just by clobbering ->real_blocked when we
> start the core dump.

I don't think ->real_blocked is a good choice, we have to add more checks
to the signal sending path. Note that currenly it is only checked under
sig_fatal() && !SIGNAL_GROUP_EXIT.

Perhaps it is easier to change dump_write() to clear TIF_SIGPENDING
unless fatal_signal_pending(),

	int coredump_file_write(struct file *file, const void *addr, int nr)
	{
		while (nr > 0) {
			int res = file->f_op->write(file, addr, nr, &file->f_pos);

			if (res > 0) {
				nr -= res;
				continue;
			}

			if (!signal_pending(current))
				break;
			if (__fatal_signal_pending(current))
				break;
			clear_thread_flag(TIF_SIGPENDING);
		}

		return !nr;
	}

> The less magical way that is obvious
> would be to add a SIGNAL_GROUP_DUMPING flag that we set at the beginning
> of the dumping, and make recalc_sigpending_tsk/complete_signal obey that.

We do not need to change recalc_sigpending_tsk. For example, if we decide
that only SIGKILL interrupts the coredumping, than I think something like
the patch below should work. But I think we should change dump_write() to
handle the short write anyway?

Of course, this all assumes f_op->write() does not do recalc_sigpending().

Oleg.

--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -644,6 +644,8 @@ static int prepare_signal(int sig, struc
 		/*
 		 * The process is in the middle of dying, nothing to do.
 		 */
+		if ((signal->flags & SIGNAL_GROUP_DUMPING) && sig != SIGKILL)
+			return 0;
 	} else if (sig_kernel_stop(sig)) {
 		/*
 		 * This is a stop signal.  Remove SIGCONT from all queues.
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1537,6 +1537,8 @@ static inline int zap_threads(struct tas
 	if (!signal_group_exit(tsk->signal)) {
 		mm->core_state = core_state;
 		tsk->signal->group_exit_code = exit_code;
+		tsk->signal->flags |= SIGNAL_GROUP_DUMPING;
+		clear_thread_flag(TIF_SIGPENDING);
 		nr = zap_process(tsk);
 	}
 	spin_unlock_irq(&tsk->sighand->siglock);
@@ -1760,12 +1762,6 @@ void do_coredump(long signr, int exit_co
 	old_cred = override_creds(cred);
 
 	/*
-	 * Clear any false indication of pending signals that might
-	 * be seen by the filesystem code called to write the core file.
-	 */
-	clear_thread_flag(TIF_SIGPENDING);
-
-	/*
 	 * lock_kernel() because format_corename() is controlled by sysctl, which
 	 * uses lock_kernel()
 	 */


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 22:32               ` Oleg Nesterov
@ 2009-06-01 23:02                 ` Roland McGrath
  2009-06-02  0:08                   ` Oleg Nesterov
  2009-06-02  8:21                 ` Alan Cox
  1 sibling, 1 reply; 36+ messages in thread
From: Roland McGrath @ 2009-06-01 23:02 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

> I don't think ->real_blocked is a good choice, we have to add more checks
> to the signal sending path. Note that currenly it is only checked under
> sig_fatal() && !SIGNAL_GROUP_EXIT.

No, you're right.  It's not a good idea.

> Perhaps it is easier to change dump_write() to clear TIF_SIGPENDING
> unless fatal_signal_pending(),

That is almost a separate subject, really.  Having i/o calls' waits wrongly
interrupted and then clearing TIF_SIGPENDING just seems goofy to me.  It
might not be just the ->write call, it might affect filp_open or dump_seek
or whatever.  It makes no sense to me to take the approach of fiddling
after doing the wrong thing.  Just don't do the wrong thing to begin with.

That means clear TIF_SIGPENDING and don't let it be set again by a signal
that is not meant to abort the core dump.  Why do anything but that?

It's almost certain that recalc_sigpending_tsk won't be called once dumping
has started.  But there is the possibility of recalc_sigpending_and_wake
via cancel_freezing, at least.  Seems safer to make recalc_sigpending_tsk
robust in this case.

Thanks,
Roland

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 23:02                 ` Roland McGrath
@ 2009-06-02  0:08                   ` Oleg Nesterov
  2009-06-03  7:09                     ` Roland McGrath
  0 siblings, 1 reply; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-02  0:08 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

On 06/01, Roland McGrath wrote:
>
> That is almost a separate subject, really.  Having i/o calls' waits wrongly
> interrupted and then clearing TIF_SIGPENDING just seems goofy to me.

Yes, agreed. The patch I sent make the coredumping task invisible to all
signals except SIGKILL.

> But there is the possibility of recalc_sigpending_and_wake
> via cancel_freezing, at least.  Seems safer to make recalc_sigpending_tsk
> robust in this case.

Oh, I forgot about freezer...

Well, not good to complicate recalc_sigpending_tsk() for this unlikely case.
And this can't help, freezer does signal_wake_up() unconditionally.

So in fact this is another argument to check signal_pending() and clear it
in dump_write/seek.

But since the coredumping task is not freezable anyway, perhaps we should
change fake_signal_wake_up() to ignore SIGNAL_GROUP_DUMPING task.

Or we should make the coredumping freezable. This means dump_write/seek
and exit_mm() should do try_to_freeze().

In any case, the coredumping is special. If ->write() returns -ERESTART/EINTR
it assumes the return to ths user-space, this is not true for the coredump.
This means that handling the spurious signals in coredump_file_write() is
not so bad if we can't avoid this.

Oleg.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-02  0:08                   ` Oleg Nesterov
@ 2009-06-03  7:09                     ` Roland McGrath
  2009-06-04  3:15                       ` Oleg Nesterov
  0 siblings, 1 reply; 36+ messages in thread
From: Roland McGrath @ 2009-06-03  7:09 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

> Oh, I forgot about freezer...

We all would like to.

> Well, not good to complicate recalc_sigpending_tsk() for this unlikely case.
> And this can't help, freezer does signal_wake_up() unconditionally.
> 
> So in fact this is another argument to check signal_pending() and clear it
> in dump_write/seek.

I can't agree with this at all.  IMHO it's far better to have a consistent
definition of when TIF_SIGPENDING ought to be triggered, and have
recalc_sigpending_tsk() use logic that matches the logic controlling when
to set TIF_SIGPENDING asynchronously (i.e. signal_wake_up calls).

> But since the coredumping task is not freezable anyway, perhaps we should
> change fake_signal_wake_up() to ignore SIGNAL_GROUP_DUMPING task.

That could be a long delay and a lot of i/o before suspending.

> Or we should make the coredumping freezable. This means dump_write/seek
> and exit_mm() should do try_to_freeze().

Yes, I think this is the thing to do for that issue.
(It's kind of a separate problem.)

> In any case, the coredumping is special. If ->write() returns -ERESTART/EINTR
> it assumes the return to ths user-space, this is not true for the coredump.
> This means that handling the spurious signals in coredump_file_write() is
> not so bad if we can't avoid this.

I am not so confident.  It seems far too easy to wind up with some other
way that TIF_SIGPENDING gets continually set and this loops, for example.
(This could be some day in the future when fs, driver or pipe-io code
changes somehow completely obscure.)  It's far better to have confidence
just in the signals code itself: the only things that set TIF_SIGPENDING
interlock with the logic of the only things that are expected to clear it.

Thanks,
Roland

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-03  7:09                     ` Roland McGrath
@ 2009-06-04  3:15                       ` Oleg Nesterov
  2009-06-04 17:14                         ` Roland McGrath
  0 siblings, 1 reply; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-04  3:15 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

On 06/03, Roland McGrath wrote:
>
> > But since the coredumping task is not freezable anyway, perhaps we should
> > change fake_signal_wake_up() to ignore SIGNAL_GROUP_DUMPING task.
>
> That could be a long delay and a lot of i/o before suspending.
>
> > Or we should make the coredumping freezable. This means dump_write/seek
> > and exit_mm() should do try_to_freeze().
>
> Yes, I think this is the thing to do for that issue.

Fortunately, this doesn't look hard. Whatever we do, we should modify
dump_write/seek to check fatal_signal_pending() anyway. Because we can't
know if f_ops->write() pays attention to signals. This means we can just
add try_to_freeze().

As for exit_mm(), we can use freezer_do_not_count() + freezer_count()
around the "for (;;)" loop.

> > In any case, the coredumping is special. If ->write() returns -ERESTART/EINTR
> > it assumes the return to ths user-space, this is not true for the coredump.
> > This means that handling the spurious signals in coredump_file_write() is
> > not so bad if we can't avoid this.
>
> I am not so confident.  It seems far too easy to wind up with some other
> way that TIF_SIGPENDING gets continually set and this loops, for example.
> (This could be some day in the future when fs, driver or pipe-io code
> changes somehow completely obscure.)  It's far better to have confidence
> just in the signals code itself: the only things that set TIF_SIGPENDING
> interlock with the logic of the only things that are expected to clear it.

Looks like, if we introduce a difference between "really killed" tasks and
exiting/execing/coredumping tasks (as discussed in another thread), we get
this all for free.

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-04  3:15                       ` Oleg Nesterov
@ 2009-06-04 17:14                         ` Roland McGrath
  2009-06-23 17:31                           ` Paul Smith
  0 siblings, 1 reply; 36+ messages in thread
From: Roland McGrath @ 2009-06-04 17:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

> Fortunately, this doesn't look hard. Whatever we do, we should modify
> dump_write/seek to check fatal_signal_pending() anyway. Because we can't
> know if f_ops->write() pays attention to signals. 

Yes, that sounds fine.

> This means we can just add try_to_freeze().

Right.

> As for exit_mm(), we can use freezer_do_not_count() + freezer_count()
> around the "for (;;)" loop.

Ah yes, sure.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-04 17:14                         ` Roland McGrath
@ 2009-06-23 17:31                           ` Paul Smith
  2009-06-23 19:37                             ` Oleg Nesterov
  0 siblings, 1 reply; 36+ messages in thread
From: Paul Smith @ 2009-06-23 17:31 UTC (permalink / raw)
  To: Oleg Nesterov, Roland McGrath
  Cc: Alan Cox, linux-kernel, stable, Andrew Morton, Andi Kleen

On Thu, 2009-06-04 at 10:14 -0700, Roland McGrath wrote:
> > Fortunately, this doesn't look hard. Whatever we do, we should modify
> > dump_write/seek to check fatal_signal_pending() anyway. Because we can't
> > know if f_ops->write() pays attention to signals. 
> 
> Yes, that sounds fine.
> 
> > This means we can just add try_to_freeze().
> 
> Right.
> 
> > As for exit_mm(), we can use freezer_do_not_count() + freezer_count()
> > around the "for (;;)" loop.
> 
> Ah yes, sure.

Hi Oleg;

Did you have any more time to look into this?  I'm currently using my
patch and it's OK for my purposes but I'll be happy to test any proposal
you come up with, if you like, so I can drop my patch in the future.

Let me know, thanks!


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-23 17:31                           ` Paul Smith
@ 2009-06-23 19:37                             ` Oleg Nesterov
  2009-07-07 19:37                               ` Oleg Nesterov
  0 siblings, 1 reply; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-23 19:37 UTC (permalink / raw)
  To: Paul Smith
  Cc: Roland McGrath, Alan Cox, linux-kernel, stable, Andrew Morton,
	Andi Kleen

On 06/23, Paul Smith wrote:
>
> Did you have any more time to look into this?  I'm currently using my
> patch and it's OK for my purposes but I'll be happy to test any proposal
> you come up with, if you like, so I can drop my patch in the future.

Thanks for reminding...

I'll try to make the patch this or next week.

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-23 19:37                             ` Oleg Nesterov
@ 2009-07-07 19:37                               ` Oleg Nesterov
  0 siblings, 0 replies; 36+ messages in thread
From: Oleg Nesterov @ 2009-07-07 19:37 UTC (permalink / raw)
  To: Paul Smith
  Cc: Roland McGrath, Alan Cox, linux-kernel, stable, Andrew Morton,
	Andi Kleen

On 06/23, Oleg Nesterov wrote:
>
> On 06/23, Paul Smith wrote:
> >
> > Did you have any more time to look into this?  I'm currently using my
> > patch and it's OK for my purposes but I'll be happy to test any proposal
> > you come up with, if you like, so I can drop my patch in the future.
>
> Thanks for reminding...
>
> I'll try to make the patch this or next week.

Well, I need to think more.

SIGNAL_GROUP_DUMPING is not that simple. If we want to use it
in recalc_sigpending(), we can only set it after coredump_wait() succeeds.
And we need some changes in send_signal()->complete_signal() anyway, to make
sure SIGKILL does wakeup even if SIGNAL_GROUP_EXIT.

Sorry for delay, will try to make patches soon ;)

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 22:32               ` Oleg Nesterov
  2009-06-01 23:02                 ` Roland McGrath
@ 2009-06-02  8:21                 ` Alan Cox
  2009-06-02 15:29                   ` Oleg Nesterov
  1 sibling, 1 reply; 36+ messages in thread
From: Alan Cox @ 2009-06-02  8:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Roland McGrath, paul, linux-kernel, stable, Andrew Morton,
	Andi Kleen

> Perhaps it is easier to change dump_write() to clear TIF_SIGPENDING
> unless fatal_signal_pending(),

If you are receiving a continuous stream of SIGIO's say then how do you
guarantee the code below will make progress ?

> 
> 	int coredump_file_write(struct file *file, const void *addr, int nr)
> 	{
> 		while (nr > 0) {
> 			int res = file->f_op->write(file, addr, nr, &file->f_pos);
> 
> 			if (res > 0) {
> 				nr -= res;
> 				continue;
> 			}
> 
> 			if (!signal_pending(current))
> 				break;
> 			if (__fatal_signal_pending(current))
> 				break;
> 			clear_thread_flag(TIF_SIGPENDING);
> 		}
> 
> 		return !nr;
> 	}
> 

> Of course, this all assumes f_op->write() does not do recalc_sigpending().

Which is itself a dangerous assumption that shouldn't be propogated.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-02  8:21                 ` Alan Cox
@ 2009-06-02 15:29                   ` Oleg Nesterov
  2009-06-03  7:15                     ` Roland McGrath
  0 siblings, 1 reply; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-02 15:29 UTC (permalink / raw)
  To: Alan Cox
  Cc: Roland McGrath, paul, linux-kernel, stable, Andrew Morton,
	Andi Kleen

On 06/02, Alan Cox wrote:
>
> > Perhaps it is easier to change dump_write() to clear TIF_SIGPENDING
> > unless fatal_signal_pending(),
>
> If you are receiving a continuous stream of SIGIO's say then how do you
> guarantee the code below will make progress ?

Yes, the second SIGIO has no effect.

> > 	int coredump_file_write(struct file *file, const void *addr, int nr)
> > 	{
> > 		while (nr > 0) {
> > 			int res = file->f_op->write(file, addr, nr, &file->f_pos);
> >
> > 			if (res > 0) {
> > 				nr -= res;
> > 				continue;
> > 			}
> >
> > 			if (!signal_pending(current))
> > 				break;
> > 			if (__fatal_signal_pending(current))
> > 				break;
> > 			clear_thread_flag(TIF_SIGPENDING);
> > 		}
> >
> > 		return !nr;
> > 	}
> >
>
> > Of course, this all assumes f_op->write() does not do recalc_sigpending().
>
> Which is itself a dangerous assumption that shouldn't be propogated.

I don't think this assumption is dangerous. Why should ->write() call
recalc_sigpending() ? It should not be called outside of signal code.

But yes, if we add SIGNAL_GROUP_DUMPING we can change recalc_sigpending_tsk().
Just I don't like the idea to slow down / complicate this helper for the
very special case.


Hmm. Does ->core_dump() report the state of ->sighand->action? I can't
find any usage. Perhaps we can just ignore all signals except SIGKILL ?

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-02 15:29                   ` Oleg Nesterov
@ 2009-06-03  7:15                     ` Roland McGrath
  0 siblings, 0 replies; 36+ messages in thread
From: Roland McGrath @ 2009-06-03  7:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, paul, linux-kernel, stable, Andrew Morton, Andi Kleen

> Hmm. Does ->core_dump() report the state of ->sighand->action? I can't
> find any usage. Perhaps we can just ignore all signals except SIGKILL ?

I don't think anything uses that information.  But it might one day,
or it might matter in some other arcane way (/proc/pid/status during
the dump is the only way I can think of to notice, but who knows).

I think it's better just not to get into any kludges that clobber
anything that records actual user-visible state that we normally
read out to userland in any fashion (even if none is known to do so
at dumping time or later).  It's much more natural just to use pure
internal kernel state such as signal->flags for anything like this.

Thanks,
Roland

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 20:38             ` Roland McGrath
  2009-06-01 22:32               ` Oleg Nesterov
@ 2009-06-03 14:05               ` Paul Smith
  1 sibling, 0 replies; 36+ messages in thread
From: Paul Smith @ 2009-06-03 14:05 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Oleg Nesterov, Alan Cox, linux-kernel, stable, Andrew Morton,
	Andi Kleen

On Mon, 2009-06-01 at 13:38 -0700, Roland McGrath wrote:
> 1. More core-dump signals.  e.g., it was already crashing and you hit ^\
>    or maybe just hit ^\ twice with a finger delay.  
> 2. Non-fatal signals (i.e. ones with handlers, stop signals).
> 3. Plain sig_fatal() non-core signals (e.g. SIGINT when not handled)
> 4. SIGKILL (an actual one from userland or oomkill, not group-exit)
> 
> #1 IMHO should not do anything at all.  
> You are asking for a core dump, it's already doing it.
> 
> #2 should not do anything at all.
> It's not really possible to suspend during the core dump, so unhandled,
> unblocked stop signals can't do anything either.
> 
> #4 IMHO should always stop everything immediately.
> That's what SIGKILL is for.  When userland generates a SIGKILL
> explicitly, that says the top priority is to be gone and cease
> consuming any resources ASAP.
> 
> #3 is the open question.  I don't feel strongly either way.

Thanks Roland.  This is a great summary and lends clarity to the
discussion.

Actually I'm quite happy with the above for #'s 1, 2, and 4.

I've already stated my preference that #3 should behave like #2, but
certainly people can disagree on this and I understand that some would
like it to behave as #4.  Best case is this can be configured or, at
least if it's documented clearly userspace applications can code
defensively by masking those signals (this has minor annoyances but...)

Unfortunately the discussion you and Oleg are having shows me how little
I know about this area of the kernel and what a bad idea it would be for
me to try to get this right on my own :-).  However, I'm happy to test
patches, comment on solutions, etc.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 16:12   ` Oleg Nesterov
  2009-06-01 16:41     ` Alan Cox
@ 2009-06-01 17:36     ` Paul Smith
  2009-06-01 17:49       ` Alan Cox
  1 sibling, 1 reply; 36+ messages in thread
From: Paul Smith @ 2009-06-01 17:36 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Alan Cox, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On Mon, 2009-06-01 at 18:12 +0200, Oleg Nesterov wrote:
> I agree, we should make the coredumping interruptible.
> 
> But I don't know which signal should intterrupt. At least SIGKILL should,
> I think. As for other unhandled sig_fatal() signals, I am nor sure.
> I can make a patch, but first I need to know what this patch should do.
> Again, please look at:
> 
> 	killable/interruptible coredumps
> 	http://marc.info/?l=linux-kernel&m=121665710711931

Ideally interrupting a core dump is something that should only ever be
done because you wanted to actually interrupt the core, and would never
happen as a side effect of some other behavior that may have been
intended for the process before it dumped core.

In my setup, which is more like an embedded system where I can reboot
the device easily, I'd rather have the core dump hang if I happen to try
to write it to an unavailable resource than to lose cores if someone
sends an errant signal to the process (assuming these are the only two
choices).  I realize that opinions about this differ based on the
purpose of the system (desktops and many types of servers probably would
rather have their pages back and don't care so much about cores).

My preference would be that no signal would ever cancel a core and there
would be some completely out-of-band method for this (something like
setting a flag via /proc/<pid> or similar) to be used if it was
necessary.  However, I can't justify that complexity.  So, SIGKILL seems
like a reasonable compromise.

On the other hand, IMO all other signals, including SIGINT and SIGQUIT,
should be ignored during core dumping.  Allowing SIGKILL gives a method
for getting rid of core dumps in the relatively rare situation where
people want/need to do so, and I don't see any real benefit to adding
more signals to the list of things you can't do if you want robust
cores.  Isn't one enough?

My $0.02.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 17:36     ` Paul Smith
@ 2009-06-01 17:49       ` Alan Cox
  2009-06-01 18:39         ` Paul Smith
  0 siblings, 1 reply; 36+ messages in thread
From: Alan Cox @ 2009-06-01 17:49 UTC (permalink / raw)
  To: paul
  Cc: Oleg Nesterov, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

> On the other hand, IMO all other signals, including SIGINT and SIGQUIT,
> should be ignored during core dumping.  Allowing SIGKILL gives a method
> for getting rid of core dumps in the relatively rare situation where
> people want/need to do so, and I don't see any real benefit to adding
> more signals to the list of things you can't do if you want robust
> cores.  Isn't one enough?

I also want usability. SIGINT/SIGQUIT are never sent except by user
requests to terminate a process so they can safely be allowed. If the
alternatives are the status quo or SIGKILL only then I'd favour the
status quo particularly having experienced the alternatives on some old
Unix systems.

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 17:49       ` Alan Cox
@ 2009-06-01 18:39         ` Paul Smith
  2009-06-01 19:02           ` Alan Cox
  0 siblings, 1 reply; 36+ messages in thread
From: Paul Smith @ 2009-06-01 18:39 UTC (permalink / raw)
  To: Alan Cox
  Cc: Oleg Nesterov, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On Mon, 2009-06-01 at 18:49 +0100, Alan Cox wrote:
> > On the other hand, IMO all other signals, including SIGINT and SIGQUIT,
> > should be ignored during core dumping.  Allowing SIGKILL gives a method
> > for getting rid of core dumps in the relatively rare situation where
> > people want/need to do so, and I don't see any real benefit to adding
> > more signals to the list of things you can't do if you want robust
> > cores.  Isn't one enough?
> 
> I also want usability. SIGINT/SIGQUIT are never sent except by user
> requests to terminate a process so they can safely be allowed. If the
> alternatives are the status quo or SIGKILL only then I'd favour the
> status quo particularly having experienced the alternatives on some old
> Unix systems.

SIGINT/SIGQUIT are sent all the time in situations where the user might
not want the core dump to be canceled.  This is what I meant by "wanted
to actually interrupt the core"; it implies the user knows that a core
is being dumped and explicitly decides they do not want to have that
happen in this situation and takes some affirmative action to stop it.

If a program seems to be unresponsive the user could ^C, without
realizing that it was really dumping core.  Now when they are asked to
produce the core so the problem can be debugged, they can't do it.  Or,
a worker process might appear unresponsive due to a core being dumped
and the parent would decide to shoot it with SIGINT based on various
timeouts etc.  Again we have no core available.

If the user has problems with coredumps there are all sorts of ways to
manage that.  You can disable core dumps altogether via ulimit.  You can
set core_pattern to dump to a fully-qualified pathname on faster media
instead of whatever working directory you're using.

Or, with this change, you can kill -9 the PID that's dumping core.

These things seem to me to provide a lot of usability features.

On the other hand there's no way to ensure full, reliable core dumps
with today's behavior.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 18:39         ` Paul Smith
@ 2009-06-01 19:02           ` Alan Cox
  2009-06-01 19:09             ` Andi Kleen
  2009-06-01 19:51             ` Paul Smith
  0 siblings, 2 replies; 36+ messages in thread
From: Alan Cox @ 2009-06-01 19:02 UTC (permalink / raw)
  To: paul
  Cc: Oleg Nesterov, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

> If a program seems to be unresponsive the user could ^C, without
> realizing that it was really dumping core.  Now when they are asked to
> produce the core so the problem can be debugged, they can't do it.  Or,

and get their prompt back, which is probably why they are banging ^C. If
they didn't want their prompt back at that point they'd still be
wondering why nothing was occuring at the point it said (core dumped)

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 19:02           ` Alan Cox
@ 2009-06-01 19:09             ` Andi Kleen
  2009-06-01 19:06               ` Alan Cox
  2009-06-01 19:51             ` Paul Smith
  1 sibling, 1 reply; 36+ messages in thread
From: Andi Kleen @ 2009-06-01 19:09 UTC (permalink / raw)
  To: Alan Cox
  Cc: paul, Oleg Nesterov, linux-kernel, stable, Andrew Morton,
	Andi Kleen, Roland McGrath

On Mon, Jun 01, 2009 at 08:02:32PM +0100, Alan Cox wrote:
> > If a program seems to be unresponsive the user could ^C, without
> > realizing that it was really dumping core.  Now when they are asked to
> > produce the core so the problem can be debugged, they can't do it.  Or,
> 
> and get their prompt back, which is probably why they are banging ^C. If
> they didn't want their prompt back at that point they'd still be
> wondering why nothing was occuring at the point it said (core dumped)

Maybe we need a background core dump?

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 19:09             ` Andi Kleen
@ 2009-06-01 19:06               ` Alan Cox
  2009-06-01 19:14                 ` Andi Kleen
  0 siblings, 1 reply; 36+ messages in thread
From: Alan Cox @ 2009-06-01 19:06 UTC (permalink / raw)
  To: Andi Kleen
  Cc: paul, Oleg Nesterov, linux-kernel, stable, Andrew Morton,
	Andi Kleen, Roland McGrath

On Mon, 1 Jun 2009 21:09:14 +0200
Andi Kleen <andi@firstfloor.org> wrote:

> On Mon, Jun 01, 2009 at 08:02:32PM +0100, Alan Cox wrote:
> > > If a program seems to be unresponsive the user could ^C, without
> > > realizing that it was really dumping core.  Now when they are asked to
> > > produce the core so the problem can be debugged, they can't do it.  Or,
> > 
> > and get their prompt back, which is probably why they are banging ^C. If
> > they didn't want their prompt back at that point they'd still be
> > wondering why nothing was occuring at the point it said (core dumped)
> 
> Maybe we need a background core dump?

You can pretty much implement that via the pipe handler if you care. Just
buffer aggressively.

For the general case however programs assume that when wait() returns
indicating the core dump occurred that they can immediately access the
dump (eg bug-buddy in Gnome)

Alan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 19:06               ` Alan Cox
@ 2009-06-01 19:14                 ` Andi Kleen
  0 siblings, 0 replies; 36+ messages in thread
From: Andi Kleen @ 2009-06-01 19:14 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andi Kleen, paul, Oleg Nesterov, linux-kernel, stable,
	Andrew Morton, Roland McGrath

On Mon, Jun 01, 2009 at 08:06:30PM +0100, Alan Cox wrote:
> On Mon, 1 Jun 2009 21:09:14 +0200
> Andi Kleen <andi@firstfloor.org> wrote:
> 
> > On Mon, Jun 01, 2009 at 08:02:32PM +0100, Alan Cox wrote:
> > > > If a program seems to be unresponsive the user could ^C, without
> > > > realizing that it was really dumping core.  Now when they are asked to
> > > > produce the core so the problem can be debugged, they can't do it.  Or,
> > > 
> > > and get their prompt back, which is probably why they are banging ^C. If
> > > they didn't want their prompt back at that point they'd still be
> > > wondering why nothing was occuring at the point it said (core dumped)
> > 
> > Maybe we need a background core dump?
> 
> You can pretty much implement that via the pipe handler if you care. Just
> buffer aggressively.
> 
> For the general case however programs assume that when wait() returns
> indicating the core dump occurred that they can immediately access the
> dump (eg bug-buddy in Gnome)

Then set a advisory lock on the dump... no seriously:

With full signal handling during dump they would probably get a lot of time
a partial dump at best, because it's common to set signals in a row. 
I agree with Paul on that that that is unfortunate at best.

Perhaps that's something that just needs a sysctl.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 19:02           ` Alan Cox
  2009-06-01 19:09             ` Andi Kleen
@ 2009-06-01 19:51             ` Paul Smith
  2009-06-01 20:20               ` Oleg Nesterov
  2009-06-01 21:34               ` Alan Cox
  1 sibling, 2 replies; 36+ messages in thread
From: Paul Smith @ 2009-06-01 19:51 UTC (permalink / raw)
  To: Alan Cox
  Cc: Oleg Nesterov, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On Mon, 2009-06-01 at 20:02 +0100, Alan Cox wrote:
> > If a program seems to be unresponsive the user could ^C, without
> > realizing that it was really dumping core.  Now when they are asked to
> > produce the core so the problem can be debugged, they can't do it.  Or,
> 
> and get their prompt back, which is probably why they are banging ^C. If
> they didn't want their prompt back at that point they'd still be
> wondering why nothing was occuring at the point it said (core dumped)

True.  My concern is that non-interactive, non-user controlled processes
seem to be getting thrown out with the bathwater here in the search for
the ultimate ease-of-use for interactive users.  SIGINT is not just a
user signal.

If it's interactive, can't the user ^Z (SIGSTOP) the process being
dumped, then kill -9 %1?  Does SIGSTOP stop a process that's dumping
core?  If this works it's not as simple as ^C, but I find myself doing
that all the time for processes which are catching SIGINT, as Oleg
points out.

Saying that SIGSTOP stops a core dump, SIGCONT continues it, SIGKILL
cancels it, and everything else is ignored would be just fine with me.

Yes, you need a shell with job control but... at some point we have to
just say it is what it is!  Core dumps are not just annoying time/disk
space wasters, they have real value; a good core dump can save tens of
thousands of dollars or more in support and development costs.  We need
(a way for) them to be reliable, even if it costs some interactive
ease-of-use.

Anyway, that's my opinion :-)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 19:51             ` Paul Smith
@ 2009-06-01 20:20               ` Oleg Nesterov
  2009-06-01 21:34               ` Alan Cox
  1 sibling, 0 replies; 36+ messages in thread
From: Oleg Nesterov @ 2009-06-01 20:20 UTC (permalink / raw)
  To: Paul Smith
  Cc: Alan Cox, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

On 06/01, Paul Smith wrote:
>
> If it's interactive, can't the user ^Z (SIGSTOP) the process being
> dumped,

No, this is very hard to implement.

Oleg.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] coredump: Retry writes where appropriate
  2009-06-01 19:51             ` Paul Smith
  2009-06-01 20:20               ` Oleg Nesterov
@ 2009-06-01 21:34               ` Alan Cox
  1 sibling, 0 replies; 36+ messages in thread
From: Alan Cox @ 2009-06-01 21:34 UTC (permalink / raw)
  To: paul
  Cc: Oleg Nesterov, linux-kernel, stable, Andrew Morton, Andi Kleen,
	Roland McGrath

> If it's interactive, can't the user ^Z (SIGSTOP)

^Z is SIGTSTP but that would be one way to implement it although I think
it might need a bit more work than just setting masks.

> Yes, you need a shell with job control but... at some point we have to
> just say it is what it is!  Core dumps are not just annoying time/disk
> space wasters, they have real value; a good core dump can save tens of
> thousands of dollars or more in support and development costs.  We need
> (a way for) them to be reliable, even if it costs some interactive
> ease-of-use.
> 
> Anyway, that's my opinion :-)

And there we get into policy and preference rather than technical debate
- no right answers 8(

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2009-07-07 19:41 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-31  5:33 [PATCH] coredump: Retry writes where appropriate Paul Smith
2009-05-31 10:18 ` Alan Cox
2009-05-31 14:03   ` Olivier Galibert
2009-05-31 16:31     ` Alan Cox
2009-05-31 16:49       ` Olivier Galibert
2009-05-31 17:46       ` Paul Smith
2009-05-31 16:56     ` Paul Smith
2009-06-01 16:12   ` Oleg Nesterov
2009-06-01 16:41     ` Alan Cox
2009-06-01 17:11       ` Oleg Nesterov
2009-06-01 17:46         ` Alan Cox
2009-06-01 18:23           ` Oleg Nesterov
2009-06-01 20:38             ` Roland McGrath
2009-06-01 22:32               ` Oleg Nesterov
2009-06-01 23:02                 ` Roland McGrath
2009-06-02  0:08                   ` Oleg Nesterov
2009-06-03  7:09                     ` Roland McGrath
2009-06-04  3:15                       ` Oleg Nesterov
2009-06-04 17:14                         ` Roland McGrath
2009-06-23 17:31                           ` Paul Smith
2009-06-23 19:37                             ` Oleg Nesterov
2009-07-07 19:37                               ` Oleg Nesterov
2009-06-02  8:21                 ` Alan Cox
2009-06-02 15:29                   ` Oleg Nesterov
2009-06-03  7:15                     ` Roland McGrath
2009-06-03 14:05               ` Paul Smith
2009-06-01 17:36     ` Paul Smith
2009-06-01 17:49       ` Alan Cox
2009-06-01 18:39         ` Paul Smith
2009-06-01 19:02           ` Alan Cox
2009-06-01 19:09             ` Andi Kleen
2009-06-01 19:06               ` Alan Cox
2009-06-01 19:14                 ` Andi Kleen
2009-06-01 19:51             ` Paul Smith
2009-06-01 20:20               ` Oleg Nesterov
2009-06-01 21:34               ` Alan Cox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox