[PATCH 1/1] signal: make group kill signal fatal

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/1] signal: make group kill signal fatal
@ 2009-05-24 20:47 Jiri Slaby
  2009-05-25  0:07 ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Slaby @ 2009-05-24 20:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: ebiederm, oleg, roland, linux-kernel, Jiri Slaby, Matthew Wilcox

__fatal_signal_pending() returns now true only for a non-group sent
sigkill, i. e. for example tgkill, send_sig...

Use sigkill_pending() in __fatal_signal_pending() which adds a test also
for shared_pending queue.

Also grab siglock in __fatal_signal_pending().

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Matthew Wilcox <matthew@wil.cx>
---
 kernel/signal.c |   28 +++++++++++++++++-----------
 1 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index e964ad9..a12897a 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -116,6 +116,16 @@ static inline int has_pending_signals(sigset_t *signal, sigset_t *blocked)
 	return ready !=	0;
 }
 
+/*
+ * Return nonzero if there is a SIGKILL that should be waking us up.
+ * Called with the siglock held.
+ */
+static int sigkill_pending(struct task_struct *tsk)
+{
+	return	sigismember(&tsk->pending.signal, SIGKILL) ||
+		sigismember(&tsk->signal->shared_pending.signal, SIGKILL);
+}
+
 #define PENDING(p,b) has_pending_signals(&(p)->signal, (b))
 
 static int recalc_sigpending_tsk(struct task_struct *t)
@@ -1033,7 +1043,13 @@ void zap_other_threads(struct task_struct *p)
 
 int __fatal_signal_pending(struct task_struct *tsk)
 {
-	return sigismember(&tsk->pending.signal, SIGKILL);
+	int ret;
+
+	spin_lock_irq(&tsk->sighand->siglock);
+	ret = sigkill_pending(tsk);
+	spin_unlock_irq(&tsk->sighand->siglock);
+
+	return ret;
 }
 EXPORT_SYMBOL(__fatal_signal_pending);
 
@@ -1549,16 +1565,6 @@ static inline int may_ptrace_stop(void)
 }
 
 /*
- * Return nonzero if there is a SIGKILL that should be waking us up.
- * Called with the siglock held.
- */
-static int sigkill_pending(struct task_struct *tsk)
-{
-	return	sigismember(&tsk->pending.signal, SIGKILL) ||
-		sigismember(&tsk->signal->shared_pending.signal, SIGKILL);
-}
-
-/*
  * This must be called with current->sighand->siglock held.
  *
  * This should be the path for all ptrace stops.
-- 
1.6.3


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-05-24 20:47 [PATCH 1/1] signal: make group kill signal fatal Jiri Slaby
@ 2009-05-25  0:07 ` Oleg Nesterov
  2009-05-25 16:21   ` Jiri Slaby
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2009-05-25  0:07 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 05/24, Jiri Slaby wrote:
>
> __fatal_signal_pending() returns now true only for a non-group sent
> sigkill, i. e. for example tgkill, send_sig...

No. Please look at complete_signal(). If we queue a fatal signal,
we always add SIGKILL to any thread.

> Use sigkill_pending()

Please do not use it, it should die.

> in __fatal_signal_pending() which adds a test also
> for shared_pending queue.

See above. Afaics this is not needed.

> Also grab siglock in __fatal_signal_pending().

This is wrong. It can be called when the task has already died
and its ->signal == NULL.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-05-25  0:07 ` Oleg Nesterov
@ 2009-05-25 16:21   ` Jiri Slaby
  2009-05-25 17:20     ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Slaby @ 2009-05-25 16:21 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 05/25/2009 02:07 AM, Oleg Nesterov wrote:
> On 05/24, Jiri Slaby wrote:
>>
>> __fatal_signal_pending() returns now true only for a non-group sent
>> sigkill, i. e. for example tgkill, send_sig...
> 
> No. Please look at complete_signal(). If we queue a fatal signal,
> we always add SIGKILL to any thread.

Ah, thanks. But it looks like it doesn't work well in some cases.
Consider this kernel code:

#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/module.h>
#include <linux/poll.h>
#include <linux/sched.h>

static int release(struct inode *ino, struct file *file)
{
        unsigned int a;
        printk(KERN_DEBUG "fst\n");
        for (a = 0; a < 10; a++) {
                printk(KERN_DEBUG "%s: SP=%u FSP=%u pend=%.16lx
shpend=%.16lx\n",
                                __func__,
                                signal_pending(current),
                                fatal_signal_pending(current),
                                current->pending.signal.sig[0],

current->signal->shared_pending.signal.sig[0]);
                if (fatal_signal_pending(current))
                        break;
                msleep(1000);
        }
        printk(KERN_DEBUG "done\n");
        return 0;
}

static DECLARE_WAIT_QUEUE_HEAD(wq);

static unsigned int poll(struct file *file, struct poll_table_struct *p)
{
        poll_wait(file, &wq, p);

        return 0;
}

static const struct file_operations fops = {
        .owner = THIS_MODULE,
        .release = release,
        .poll = poll,
};

static struct miscdevice m = {
        .minor = MISC_DYNAMIC_MINOR,
        .name = "m",
        .fops = &fops,
};

static int init1(void)
{
        return misc_register(&m);
}

static void exit1(void)
{
        misc_deregister(&m);
}

module_init(init1);
module_exit(exit1);

MODULE_LICENSE("GPL");






----------------------------------------------
And this user code:
#include <err.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(int argc, char **argv)
{
        struct pollfd fds = { .events = POLLIN };
        int fd;

        fd = open("/dev/m", O_RDONLY);
        if (fd < 0)
                err(1, "open");
        fds.fd = fd;
        if (poll(&fds, 1, -1) < 0)
                err(2, "poll");
        close(fd);
        return 0;
}






----------------------------------------------
It outputs:
fst
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100
done

I.e. it won't get propagated to current->pending. Don't you have idea
why? If the poll isn't there, it works well.

glibc 2.9, x86_64

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-05-25 16:21   ` Jiri Slaby
@ 2009-05-25 17:20     ` Oleg Nesterov
  2009-05-25 18:15       ` Jiri Slaby
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2009-05-25 17:20 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 05/25, Jiri Slaby wrote:
>
> On 05/25/2009 02:07 AM, Oleg Nesterov wrote:
> > On 05/24, Jiri Slaby wrote:
> >>
> >> __fatal_signal_pending() returns now true only for a non-group sent
> >> sigkill, i. e. for example tgkill, send_sig...
> >
> > No. Please look at complete_signal(). If we queue a fatal signal,
> > we always add SIGKILL to any thread.
>
> Ah, thanks. But it looks like it doesn't work well in some cases.
> Consider this kernel code:
>
> ...
> static int release(struct inode *ino, struct file *file)
> {
>         unsigned int a;
>         printk(KERN_DEBUG "fst\n");
>         for (a = 0; a < 10; a++) {
>                 printk(KERN_DEBUG "%s: SP=%u FSP=%u pend=%.16lx
> shpend=%.16lx\n",
>                                 __func__,
>                                 signal_pending(current),
>                                 fatal_signal_pending(current),
>                                 current->pending.signal.sig[0],
>
> current->signal->shared_pending.signal.sig[0]);
>
> ...
>
> int main(int argc, char **argv)
> {
>         struct pollfd fds = { .events = POLLIN };
>         int fd;
>
>         fd = open("/dev/m", O_RDONLY);
>         if (fd < 0)
>                 err(1, "open");
>         fds.fd = fd;
>         if (poll(&fds, 1, -1) < 0)
>                 err(2, "poll");
>         close(fd);
>         return 0;
> }
>
> ----------------------------------------------
> It outputs:
> fst
> release: SP=1 FSP=0 pend=0000000000000000 shpend=0000000000000100

Because (I guess) ->release() is called by do_exit()->close_files(),
by this time the private SIGKILL is already dequeued.

If you kill this program, poll() never returns to the user-space.

> If the poll isn't there, it works well.

Hmm. this is strange. Do you mean that if this program does
sleep(10000) (or something else) instead of poll() above, it
prints pend != 0 ?

Note. There are known issues with fatal_signal_pending() in exit()
path. But in any case, we should not check ->shared_pending.
And even if ->signal != NULL we must not use ->siglock. Because
schedule() checks fatal_signal_pending() under rq->lock, we can
deadlock. Also, the fact that SIGKILL is actually pending is the
current implementation detail, probably this will be changed.

__fatal_signal_pending() could check SIGNAL_GROUP_EXIT, but we
can't do this now, ->signal can be NULL. Actually signal_group_exit()
is more correct.

And. Why do you need fatal_signal_pending() ? It is special,
should be used by things like wait_event_killable().

Oleg.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-05-25 17:20     ` Oleg Nesterov
@ 2009-05-25 18:15       ` Jiri Slaby
  2009-05-25 22:51         ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Slaby @ 2009-05-25 18:15 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 05/25/2009 07:20 PM, Oleg Nesterov wrote:
> On 05/25, Jiri Slaby wrote:
>> If the poll isn't there, it works well.
> 
> Hmm. this is strange. Do you mean that if this program does
> sleep(10000) (or something else) instead of poll() above, it
> prints pend != 0 ?

No, only when there is nothing, i.e. when it directly calls close. It's
consistent with what you wrote. When there is sleep(), it works the same
as the poll case.

> And. Why do you need fatal_signal_pending() ? It is special,
> should be used by things like wait_event_killable().

I need to wait for a device to finish its work in last release, but also
want to allow user to kill the waiting by SIGKILL if he thinks the
device locked up (this is pretty usual for that particular device). If I
use wait_event_killable, I end up with this.

Well, there is a STOP ioctl and a user should catch sigterm/int/quit and
call that ioctl and bail out. But users are weird and get unkillable
processes.

Anyway thanks for the clarification.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-05-25 18:15       ` Jiri Slaby
@ 2009-05-25 22:51         ` Oleg Nesterov
  2009-06-02 12:54           ` Jiri Slaby
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2009-05-25 22:51 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 05/25, Jiri Slaby wrote:
>
> On 05/25/2009 07:20 PM, Oleg Nesterov wrote:
> > On 05/25, Jiri Slaby wrote:
> >> If the poll isn't there, it works well.
> >
> > Hmm. this is strange. Do you mean that if this program does
> > sleep(10000) (or something else) instead of poll() above, it
> > prints pend != 0 ?
>
> No, only when there is nothing, i.e. when it directly calls close. It's
> consistent with what you wrote. When there is sleep(), it works the same
> as the poll case.

Good ;)

> > And. Why do you need fatal_signal_pending() ? It is special,
> > should be used by things like wait_event_killable().
>
> I need to wait for a device to finish its work in last release, but also
> want to allow user to kill the waiting by SIGKILL if he thinks the
> device locked up (this is pretty usual for that particular device). If I
> use wait_event_killable, I end up with this.

Heh. In this case you have another (long-standing) issue, please note
the "if (p->flags & PF_EXITING)" check in wants_signal().

There is no guarantee the signal will wake up the exiting task task.
Even SIGKILL, even if you use wait_event_interruptible() instead of
_killable.

Oleg.


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-05-25 22:51         ` Oleg Nesterov
@ 2009-06-02 12:54           ` Jiri Slaby
  2009-06-02 14:50             ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Slaby @ 2009-06-02 12:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 05/26/2009 12:51 AM, Oleg Nesterov wrote:
> Heh. In this case you have another (long-standing) issue, please note
> the "if (p->flags & PF_EXITING)" check in wants_signal().
> 
> There is no guarantee the signal will wake up the exiting task task.
> Even SIGKILL, even if you use wait_event_interruptible() instead of
> _killable.

Last question, doesn't wait_event_interruptible return immediately in
this case? signal_pending returns true due to non-captured signal which
killed the application and hence we are in the .release under these
special circumstances. I think this is not much expected behavior, is
it? Shouldn't be that signal dequeued/cleared instead?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-06-02 12:54           ` Jiri Slaby
@ 2009-06-02 14:50             ` Oleg Nesterov
  2009-06-03  1:52               ` Roland McGrath
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Nesterov @ 2009-06-02 14:50 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, ebiederm, roland, linux-kernel, Matthew Wilcox

On 06/02, Jiri Slaby wrote:
>
> On 05/26/2009 12:51 AM, Oleg Nesterov wrote:
> > Heh. In this case you have another (long-standing) issue, please note
> > the "if (p->flags & PF_EXITING)" check in wants_signal().
> >
> > There is no guarantee the signal will wake up the exiting task task.
> > Even SIGKILL, even if you use wait_event_interruptible() instead of
> > _killable.
>
> Last question, doesn't wait_event_interruptible return immediately in
> this case? signal_pending returns true

Yes, if a thread exits with the pending signal, then of course interruptible
wait doesn work.

> due to non-captured signal which
> killed the application

This signal can be already dequeued, but not necessary. Most probably,
when the process is killed by group the fatal signal, each thread will
exit with the shared signal pending.

> I think this is not much expected behavior, is
> it?

Well. I don't know. I'd say this is expected, but I don't think this
was specially designed ;)

> Shouldn't be that signal dequeued/cleared instead?

We can't. Think about the multithreads program. Some thread exits,
and we have a shared signal. We must not dequeue it.

We can clear TIF_SIGPENDING, and we can change recalc_sigpending_xxx()
to take PF_EXITING into account (or change their callers), but this
needs changes. And I am not sure this will right.

Oleg.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-06-02 14:50             ` Oleg Nesterov
@ 2009-06-03  1:52               ` Roland McGrath
  2009-06-04  2:27                 ` Oleg Nesterov
  0 siblings, 1 reply; 10+ messages in thread
From: Roland McGrath @ 2009-06-03  1:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Jiri Slaby, Andrew Morton, ebiederm, linux-kernel, Matthew Wilcox

> > > Heh. In this case you have another (long-standing) issue, please note
> > > the "if (p->flags & PF_EXITING)" check in wants_signal().

Hmm.  wants_signal():

	if (p->flags & PF_EXITING)
		return 0;
	if (sig == SIGKILL)
		return 1;

Perhaps we should reverse the order of those two?

But also I'm now reminded that complete_signal() short-circuits for the
single-threaded case and never does the sig_fatal() case.

This means a single-threaded process will have SIGKILL in shared_pending
but not in its own pending so __fatal_signal_pending() will be false, no?  

I'm also now wondering if in some of our recent signals discussions we have
been assuming that SIGNAL_GROUP_EXIT is set when a fatal signal is pending.
We might be leaving some other unintended hole since that's not really true.

Probably we should just fiddle complete_signal() to do that stuff for the
single-threaded case too.  (That obviates the wants_signal change above.)

> Yes, if a thread exits with the pending signal, then of course interruptible
> wait doesn work.

Along the same lines of the recent core dump discussion, I think it would
be proper to fix this so TIF_SIGPENDING isn't left set (nor is newly set)
by a signal that won't affect it later.

> We can clear TIF_SIGPENDING, and we can change recalc_sigpending_xxx()
> to take PF_EXITING into account (or change their callers), but this
> needs changes. And I am not sure this will right.

I think we want recalc_sigpending_tsk to be consistent with wants_signal
and the other conditions controlling signal_wake_up calls.  But indeed we
need to think through any ramifications carefully.

Thanks,
Roland

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/1] signal: make group kill signal fatal
  2009-06-03  1:52               ` Roland McGrath
@ 2009-06-04  2:27                 ` Oleg Nesterov
  0 siblings, 0 replies; 10+ messages in thread
From: Oleg Nesterov @ 2009-06-04  2:27 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Jiri Slaby, Andrew Morton, ebiederm, linux-kernel, Matthew Wilcox

On 06/02, Roland McGrath wrote:
>
> > > > Heh. In this case you have another (long-standing) issue, please note
> > > > the "if (p->flags & PF_EXITING)" check in wants_signal().
>
> Hmm.  wants_signal():
>
> 	if (p->flags & PF_EXITING)
> 		return 0;
> 	if (sig == SIGKILL)
> 		return 1;
>
> Perhaps we should reverse the order of those two?

Yes perhaps. But afaics this is not enough.

First of all, we should decide what we really want wrt exiting process/thread
&& signals. (see also the end of message).

Let's suppose the killed/exiting process hangs somewhere in close_files(),
and the user wants to SIGKILL via kill(1).

If this process is multithreaded, how can we find the right thread to
wake up? Or we should assume the user should find the offending thread
and use tkill() ? In that case, what if this thread still has the pending
private SIGKILL ?

Of course, the same problem with the shared SIGKILL pending, it is never
dequeued so the next group-wide SIGKILL has no effect.

> But also I'm now reminded that complete_signal() short-circuits for the
> single-threaded case and never does the sig_fatal() case.
>
> This means a single-threaded process will have SIGKILL in shared_pending
> but not in its own pending so __fatal_signal_pending() will be false, no?

Hmm, afaics no. Or I misunderstood. Or I missed something.

Yes, it is possible that we add SIGKILL in shared_pending and do not add
it in ->pending, but this can only happen if all threads have PF_EXITING.
(so "single-threaded" above doesn't matter).

> I'm also now wondering if in some of our recent signals discussions we have
> been assuming that SIGNAL_GROUP_EXIT is set when a fatal signal is pending.

Yes. SIGNAL_GROUP_EXIT == all threads have the pending private SIGKILL.
Except, in do_exit() path, it can be already dequeued.

> > We can clear TIF_SIGPENDING, and we can change recalc_sigpending_xxx()
> > to take PF_EXITING into account (or change their callers), but this
> > needs changes. And I am not sure this will right.
>
> I think we want recalc_sigpending_tsk to be consistent with wants_signal
> and the other conditions controlling signal_wake_up calls.

Well, perhaps. But let's look from the different angle. IF the task was
already SIGKILL'ed, it looks a bit insane we need another SIGKILL to
really kill it if it hangs in do_exit().

Perhaps we need another flag, SIGNAL_GROUP_KILLED or whatever which is
set along with SIGNAL_GROUP_EXIT by complete_signal() when the task is
killed. It is not set by zap_other_threads/etc.

Now, exit_signals() should do something like

	if (SIGNAL_GROUP_KILLED) {
		// make sure interruptible/killable sleep is not
		// possible, we are already killed
		set_thread_flag(TIF_SIGPENDING);
	} else {
		// OK, we still respect SIGKILL
		clear_thread_flag(TIF_SIGPENDING);
	}

Of course we need other changes. complete_signal() should check
SIGNAL_GROUP_KILLED, not SIGNAL_GROUP_EXIT, and wake up all threads.
recalc_sigpending_tsk() needs changes, __fatal_signal_pending()
should be consistent with SIGNAL_GROUP_KILLED on exiting, etc.

Note also complete_signal() does signal_wake_up(t, sig == SIGKILL)
even if SIGNAL_GROUP_EXIT, we should be carefull.

> But indeed we
> need to think through any ramifications carefully.

Agreed. And yes, this is connected to the coredump discussion.

Oleg.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2009-06-04  2:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-05-24 20:47 [PATCH 1/1] signal: make group kill signal fatal Jiri Slaby
2009-05-25  0:07 ` Oleg Nesterov
2009-05-25 16:21   ` Jiri Slaby
2009-05-25 17:20     ` Oleg Nesterov
2009-05-25 18:15       ` Jiri Slaby
2009-05-25 22:51         ` Oleg Nesterov
2009-06-02 12:54           ` Jiri Slaby
2009-06-02 14:50             ` Oleg Nesterov
2009-06-03  1:52               ` Roland McGrath
2009-06-04  2:27                 ` Oleg Nesterov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox