* [PATCH] Additional/catchup RCU signal fixes for -mm
@ 2005-11-05 1:36 Paul E. McKenney
2005-11-05 16:32 ` Oleg Nesterov
0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2005-11-05 1:36 UTC (permalink / raw)
To: akpm; +Cc: linux-kernel, dipankar, mingo, suzannew, oleg
Hello!
Additional RCU signal fixes to cover a race with exec() and to add
some rcu_dereference()s from earlier patches.
Signed-off-by: <paulmck@us.ibm.com>
---
signal.c | 28 +++++++++++++++-------------
1 files changed, 15 insertions(+), 13 deletions(-)
diff -urpNa -X dontdiff linux-2.6.14-mm0/kernel/signal.c linux-2.6.14-mm0-fix/kernel/signal.c
--- linux-2.6.14-mm0/kernel/signal.c 2005-11-04 14:37:18.000000000 -0800
+++ linux-2.6.14-mm0-fix/kernel/signal.c 2005-11-04 17:23:40.000000000 -0800
@@ -337,7 +337,7 @@ void exit_sighand(struct task_struct *ts
write_lock_irq(&tasklist_lock);
rcu_read_lock();
if (tsk->sighand != NULL) {
- struct sighand_struct *sighand = tsk->sighand;
+ struct sighand_struct *sighand = rcu_dereference(tsk->sighand);
spin_lock(&sighand->siglock);
__exit_sighand(tsk);
spin_unlock(&sighand->siglock);
@@ -352,13 +352,14 @@ void exit_sighand(struct task_struct *ts
void __exit_signal(struct task_struct *tsk)
{
struct signal_struct * sig = tsk->signal;
- struct sighand_struct * sighand = tsk->sighand;
+ struct sighand_struct * sighand;
if (!sig)
BUG();
if (!atomic_read(&sig->count))
BUG();
rcu_read_lock();
+ sighand = rcu_dereference(tsk->sighand);
spin_lock(&sighand->siglock);
posix_cpu_timers_exit(tsk);
if (atomic_dec_and_test(&sig->count)) {
@@ -1100,7 +1101,7 @@ void zap_other_threads(struct task_struc
}
/*
- * Must be called with the tasklist_lock held for reading!
+ * Must be called under rcu_read_lock() or with tasklist_lock read-held.
*/
int group_send_sig_info(int sig, struct siginfo *info, struct task_struct *p)
{
@@ -1386,7 +1387,7 @@ send_sigqueue(int sig, struct sigqueue *
{
unsigned long flags;
int ret = 0;
- struct sighand_struct *sh = p->sighand;
+ struct sighand_struct *sh;
BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
@@ -1405,7 +1406,15 @@ send_sigqueue(int sig, struct sigqueue *
goto out_err;
}
+retry:
+ sh = rcu_dereference(p->sighand);
+
spin_lock_irqsave(&sh->siglock, flags);
+ if (p->sighand != sh) {
+ /* We raced with exec() in a multithreaded process... */
+ spin_unlock_irqrestore(&sh->siglock, flags);
+ goto retry;
+ }
/*
* We do the check here again to handle the following scenario:
@@ -1464,15 +1473,8 @@ send_group_sigqueue(int sig, struct sigq
BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));
- while (!read_trylock(&tasklist_lock)) {
- if (!p->sighand)
- return -1;
- cpu_relax();
- }
- if (unlikely(!p->sighand)) {
- ret = -1;
- goto out_err;
- }
+ read_lock(&tasklist_lock);
+ /* Since it_lock is held, p->sighand cannot be NULL. */
spin_lock_irqsave(&p->sighand->siglock, flags);
handle_stop_signal(sig, p);
^ permalink raw reply [flat|nested] 16+ messages in thread* Re: [PATCH] Additional/catchup RCU signal fixes for -mm 2005-11-05 1:36 [PATCH] Additional/catchup RCU signal fixes for -mm Paul E. McKenney @ 2005-11-05 16:32 ` Oleg Nesterov 2005-11-06 1:00 ` Paul E. McKenney 0 siblings, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2005-11-05 16:32 UTC (permalink / raw) To: paulmck; +Cc: akpm, linux-kernel, dipankar, mingo, suzannew "Paul E. McKenney" wrote: > > @@ -1386,7 +1387,7 @@ send_sigqueue(int sig, struct sigqueue * > { > unsigned long flags; > int ret = 0; > - struct sighand_struct *sh = p->sighand; > + struct sighand_struct *sh; > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > @@ -1405,7 +1406,15 @@ send_sigqueue(int sig, struct sigqueue * > goto out_err; > } > > +retry: > + sh = rcu_dereference(p->sighand); > + > spin_lock_irqsave(&sh->siglock, flags); > + if (p->sighand != sh) { > + /* We raced with exec() in a multithreaded process... */ > + spin_unlock_irqrestore(&sh->siglock, flags); > + goto retry; p->sighand can't be changed, de_thread calls exit_itimers() before changing ->sighand. But I still think it can be NULL, and send_sigqueue() should return -1 in that case. > @@ -1464,15 +1473,8 @@ send_group_sigqueue(int sig, struct sigq > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > - while (!read_trylock(&tasklist_lock)) { > - if (!p->sighand) > - return -1; > - cpu_relax(); > - } > - if (unlikely(!p->sighand)) { > - ret = -1; > - goto out_err; > - } > + read_lock(&tasklist_lock); > + /* Since it_lock is held, p->sighand cannot be NULL. */ > spin_lock_irqsave(&p->sighand->siglock, flags); Again, I think the comment is wrong. However, now I think we really have a race with exec, and this race was not introduced by your patches! If !thread_group_leader() does exec de_thread() calls release_task(->group_leader) before calling exit_itimers(). This means that send_group_sigqueue() which always has p == ->group_leader parameter can oops here. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Additional/catchup RCU signal fixes for -mm 2005-11-05 16:32 ` Oleg Nesterov @ 2005-11-06 1:00 ` Paul E. McKenney 2005-11-06 14:17 ` Oleg Nesterov 2005-11-06 14:32 ` Posix timers vs exec problems Oleg Nesterov 0 siblings, 2 replies; 16+ messages in thread From: Paul E. McKenney @ 2005-11-06 1:00 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, dipankar, mingo, suzannew On Sat, Nov 05, 2005 at 07:32:47PM +0300, Oleg Nesterov wrote: > "Paul E. McKenney" wrote: > > > > @@ -1386,7 +1387,7 @@ send_sigqueue(int sig, struct sigqueue * > > { > > unsigned long flags; > > int ret = 0; > > - struct sighand_struct *sh = p->sighand; > > + struct sighand_struct *sh; > > > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > > > @@ -1405,7 +1406,15 @@ send_sigqueue(int sig, struct sigqueue * > > goto out_err; > > } > > > > +retry: > > + sh = rcu_dereference(p->sighand); > > + > > spin_lock_irqsave(&sh->siglock, flags); > > + if (p->sighand != sh) { > > + /* We raced with exec() in a multithreaded process... */ > > + spin_unlock_irqrestore(&sh->siglock, flags); > > + goto retry; > > p->sighand can't be changed, de_thread calls exit_itimers() before > changing ->sighand. But I still think it can be NULL, and send_sigqueue() > should return -1 in that case. OK, I put the NULL check in with my previous patch. And you are absolutely right in the de_thread() case. I need to add more cases to steamroller... > > @@ -1464,15 +1473,8 @@ send_group_sigqueue(int sig, struct sigq > > > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > > > - while (!read_trylock(&tasklist_lock)) { > > - if (!p->sighand) > > - return -1; > > - cpu_relax(); > > - } > > - if (unlikely(!p->sighand)) { > > - ret = -1; > > - goto out_err; > > - } > > + read_lock(&tasklist_lock); > > + /* Since it_lock is held, p->sighand cannot be NULL. */ > > spin_lock_irqsave(&p->sighand->siglock, flags); > > Again, I think the comment is wrong. > > However, now I think we really have a race with exec, and this race was not > introduced by your patches! This patch was not mine, though I guess that it is by now. ;-) > If !thread_group_leader() does exec de_thread() calls release_task(->group_leader) > before calling exit_itimers(). This means that send_group_sigqueue() which > always has p == ->group_leader parameter can oops here. But in that case, __exit_sighand(->group_leader) would have been called, so ->sighand would be NULL. And none of this can change while we are holding tasklist_lock. If we don't want to be hitting the exec()ed task with a signal, the thing to do would be to drop the signal, as in the attached patch. I believe that this is an acceptable approach, since had the timer fired slightly later, it would have been disabled, right? Thoughts? Thanx, Paul Signed-off-by: <paulmck@us.ibm.com> diff -urpNa -X dontdiff linux-2.6.14-mm0-fix-2/kernel/signal.c linux-2.6.14-mm0-fix-3/kernel/signal.c --- linux-2.6.14-mm0-fix-2/kernel/signal.c 2005-11-05 15:05:38.000000000 -0800 +++ linux-2.6.14-mm0-fix-3/kernel/signal.c 2005-11-05 16:27:52.000000000 -0800 @@ -1481,6 +1481,10 @@ send_group_sigqueue(int sig, struct sigq read_lock(&tasklist_lock); while (p->group_leader != p) p = p->group_leader; + if (p->sighand == NULL) { + ret = 1; + goto out_err; + } spin_lock_irqsave(&p->sighand->siglock, flags); handle_stop_signal(sig, p); ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Additional/catchup RCU signal fixes for -mm 2005-11-06 1:00 ` Paul E. McKenney @ 2005-11-06 14:17 ` Oleg Nesterov 2005-11-06 14:46 ` Oleg Nesterov 2005-11-06 14:32 ` Posix timers vs exec problems Oleg Nesterov 1 sibling, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2005-11-06 14:17 UTC (permalink / raw) To: paulmck; +Cc: akpm, linux-kernel, dipankar, mingo, suzannew "Paul E. McKenney" wrote: > > > If !thread_group_leader() does exec de_thread() calls release_task(->group_leader) > > before calling exit_itimers(). This means that send_group_sigqueue() which > > always has p == ->group_leader parameter can oops here. > > But in that case, __exit_sighand(->group_leader) would have been called, > so ->sighand would be NULL. Yes, that is why (I think) oops can happen. > And none of this can change while we are holding > tasklist_lock. Yes, but de_thread()->release_task(->group_leader) can take tasklist_lock before us. > If we don't want to be hitting the exec()ed task with a signal, the > thing to do would be to drop the signal, as in the attached patch. > I believe that this is an acceptable approach, since had the timer > fired slightly later, it would have been disabled, right? > > Thoughts? > > Thanx, Paul > > Signed-off-by: <paulmck@us.ibm.com> > > diff -urpNa -X dontdiff linux-2.6.14-mm0-fix-2/kernel/signal.c linux-2.6.14-mm0-fix-3/kernel/signal.c > --- linux-2.6.14-mm0-fix-2/kernel/signal.c 2005-11-05 15:05:38.000000000 -0800 > +++ linux-2.6.14-mm0-fix-3/kernel/signal.c 2005-11-05 16:27:52.000000000 -0800 > @@ -1481,6 +1481,10 @@ send_group_sigqueue(int sig, struct sigq > read_lock(&tasklist_lock); > while (p->group_leader != p) > p = p->group_leader; > + if (p->sighand == NULL) { > + ret = 1; Oh, I think there is another problem here. I'll post a separate message. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Additional/catchup RCU signal fixes for -mm 2005-11-06 14:17 ` Oleg Nesterov @ 2005-11-06 14:46 ` Oleg Nesterov 2005-11-06 23:02 ` Paul E. McKenney 0 siblings, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2005-11-06 14:46 UTC (permalink / raw) To: paulmck, akpm, linux-kernel, dipankar, mingo, suzannew Oleg Nesterov wrote: > > "Paul E. McKenney" wrote: > > > > + if (p->sighand == NULL) { > > + ret = 1; > > Oh, I think there is another problem here. I'll post a separate > message. Sorry, I was not clear. This problem is unrelated. Yes, I think we should drop the signal. But please note that ret = 1 (sig_ignored) means (surprise!) "reschedule and re-arm this timer right now" in cpu-timer case. It is not critical, but may be ret = 0 is better. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Additional/catchup RCU signal fixes for -mm 2005-11-06 14:46 ` Oleg Nesterov @ 2005-11-06 23:02 ` Paul E. McKenney 0 siblings, 0 replies; 16+ messages in thread From: Paul E. McKenney @ 2005-11-06 23:02 UTC (permalink / raw) To: Oleg Nesterov; +Cc: akpm, linux-kernel, dipankar, mingo, suzannew On Sun, Nov 06, 2005 at 05:46:09PM +0300, Oleg Nesterov wrote: > Oleg Nesterov wrote: > > > > "Paul E. McKenney" wrote: > > > > > > + if (p->sighand == NULL) { > > > + ret = 1; > > > > Oh, I think there is another problem here. I'll post a separate > > message. > > Sorry, I was not clear. This problem is unrelated. Yes, I think we > should drop the signal. But please note that ret = 1 (sig_ignored) > means (surprise!) "reschedule and re-arm this timer right now" in > cpu-timer case. It is not critical, but may be ret = 0 is better. OK. Seems like the next firing of the timer would then see the changed situation, so the current code should at least be safe. Thoughts? Thanx, Paul ^ permalink raw reply [flat|nested] 16+ messages in thread
* Posix timers vs exec problems 2005-11-06 1:00 ` Paul E. McKenney 2005-11-06 14:17 ` Oleg Nesterov @ 2005-11-06 14:32 ` Oleg Nesterov 2005-11-07 18:12 ` [PATCH] fix de_thread() vs send_group_sigqueue() race Oleg Nesterov 2005-11-16 23:26 ` [PATCH] sigaction should clear all signals on SIG_IGN, not just < 32 George Anzinger 1 sibling, 2 replies; 16+ messages in thread From: Oleg Nesterov @ 2005-11-06 14:32 UTC (permalink / raw) To: paulmck, Roland McGrath, George Anzinger Cc: akpm, linux-kernel, dipankar, mingo, suzannew Roland, George, I think posix timers have problems with de_thread(). First, when non-leader thread does exec it calls release_task(->group_leader) before calling exit_itimers(). This means that send_group_sigqueue() can oops after taking tasklist_lock while doing spin_lock_irqsave(->sighand->siglock). This is easy to fix. cpu-timers have another problem. In !CPUCLOCK_PERTHREAD() case we attach the timer to ->signal->cpu_timers, so these timers (when created by another process) will survive after exec. However they will continue to profile execed process only if it was group_leader who did exec, otherwise posix_cpu_timer_schedule() will notice that timer->it.cpu.task has ->signal == NULL and stop this timer. So, should de_thread() call posix_cpu_timers_exit_group() after exit_itimers() thus stoping cpu-timers after exec? Another option is to change ->it.cpu.task for each timer in ->signal->cpu_timers, this way cpu-timers will continue to work. But this is not trivial. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] fix de_thread() vs send_group_sigqueue() race 2005-11-06 14:32 ` Posix timers vs exec problems Oleg Nesterov @ 2005-11-07 18:12 ` Oleg Nesterov 2005-11-08 20:36 ` Chris Wright 2005-11-16 23:26 ` [PATCH] sigaction should clear all signals on SIG_IGN, not just < 32 George Anzinger 1 sibling, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2005-11-07 18:12 UTC (permalink / raw) To: paulmck, Roland McGrath, George Anzinger, akpm, linux-kernel, dipankar, mingo, suzannew, Chris Wright When non-leader thread does exec, de_thread calls release_task(leader) before calling exit_itimers(). If local timer interrupt happens in between, it can oops in send_group_sigqueue() while taking ->sighand->siglock == NULL. However, we can't change send_group_sigqueue() to check p->signal != NULL, because sys_timer_create() does get_task_struct() only in SIGEV_THREAD_ID case. So it is possible that this task_struct was already freed and we can't trust p->signal. This patch changes de_thread() so that leader released after exit_itimers() call. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> --- 2.6.14/fs/exec.c~ 2005-09-21 21:08:33.000000000 +0400 +++ 2.6.14/fs/exec.c 2005-11-07 23:54:42.000000000 +0300 @@ -593,6 +593,7 @@ static inline int de_thread(struct task_ struct signal_struct *sig = tsk->signal; struct sighand_struct *newsighand, *oldsighand = tsk->sighand; spinlock_t *lock = &oldsighand->siglock; + struct task_struct *leader = NULL; int count; /* @@ -668,7 +669,7 @@ static inline int de_thread(struct task_ * and to assume its PID: */ if (!thread_group_leader(current)) { - struct task_struct *leader = current->group_leader, *parent; + struct task_struct *parent; struct dentry *proc_dentry1, *proc_dentry2; unsigned long exit_state, ptrace; @@ -677,6 +678,7 @@ static inline int de_thread(struct task_ * It should already be zombie at this point, most * of the time. */ + leader = current->group_leader; while (leader->exit_state != EXIT_ZOMBIE) yield(); @@ -736,7 +738,6 @@ static inline int de_thread(struct task_ proc_pid_flush(proc_dentry2); BUG_ON(exit_state != EXIT_ZOMBIE); - release_task(leader); } /* @@ -746,8 +747,11 @@ static inline int de_thread(struct task_ sig->flags = 0; no_thread_group: - BUG_ON(atomic_read(&sig->count) != 1); exit_itimers(sig); + if (leader) + release_task(leader); + + BUG_ON(atomic_read(&sig->count) != 1); if (atomic_read(&oldsighand->count) == 1) { /* ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] fix de_thread() vs send_group_sigqueue() race 2005-11-07 18:12 ` [PATCH] fix de_thread() vs send_group_sigqueue() race Oleg Nesterov @ 2005-11-08 20:36 ` Chris Wright 2005-11-08 20:55 ` Linus Torvalds 0 siblings, 1 reply; 16+ messages in thread From: Chris Wright @ 2005-11-08 20:36 UTC (permalink / raw) To: Oleg Nesterov Cc: paulmck, Roland McGrath, George Anzinger, akpm, linux-kernel, dipankar, mingo, suzannew, Chris Wright, torvalds * Oleg Nesterov (oleg@tv-sign.ru) wrote: > When non-leader thread does exec, de_thread calls release_task(leader) before > calling exit_itimers(). If local timer interrupt happens in between, it can > oops in send_group_sigqueue() while taking ->sighand->siglock == NULL. > > However, we can't change send_group_sigqueue() to check p->signal != NULL, > because sys_timer_create() does get_task_struct() only in SIGEV_THREAD_ID > case. So it is possible that this task_struct was already freed and we can't > trust p->signal. > > This patch changes de_thread() so that leader released after exit_itimers() > call. Nice catch. As soon as Linus picks it up we'll put it in -stable as well. > Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Acked-by: Chris Wright <chrisw@osdl.org> > --- 2.6.14/fs/exec.c~ 2005-09-21 21:08:33.000000000 +0400 > +++ 2.6.14/fs/exec.c 2005-11-07 23:54:42.000000000 +0300 > @@ -593,6 +593,7 @@ static inline int de_thread(struct task_ > struct signal_struct *sig = tsk->signal; > struct sighand_struct *newsighand, *oldsighand = tsk->sighand; > spinlock_t *lock = &oldsighand->siglock; > + struct task_struct *leader = NULL; > int count; > > /* > @@ -668,7 +669,7 @@ static inline int de_thread(struct task_ > * and to assume its PID: > */ > if (!thread_group_leader(current)) { > - struct task_struct *leader = current->group_leader, *parent; > + struct task_struct *parent; > struct dentry *proc_dentry1, *proc_dentry2; > unsigned long exit_state, ptrace; > > @@ -677,6 +678,7 @@ static inline int de_thread(struct task_ > * It should already be zombie at this point, most > * of the time. > */ > + leader = current->group_leader; > while (leader->exit_state != EXIT_ZOMBIE) > yield(); > > @@ -736,7 +738,6 @@ static inline int de_thread(struct task_ > proc_pid_flush(proc_dentry2); > > BUG_ON(exit_state != EXIT_ZOMBIE); > - release_task(leader); > } > > /* > @@ -746,8 +747,11 @@ static inline int de_thread(struct task_ > sig->flags = 0; > > no_thread_group: > - BUG_ON(atomic_read(&sig->count) != 1); > exit_itimers(sig); > + if (leader) > + release_task(leader); > + > + BUG_ON(atomic_read(&sig->count) != 1); > > if (atomic_read(&oldsighand->count) == 1) { > /* ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] fix de_thread() vs send_group_sigqueue() race 2005-11-08 20:36 ` Chris Wright @ 2005-11-08 20:55 ` Linus Torvalds 0 siblings, 0 replies; 16+ messages in thread From: Linus Torvalds @ 2005-11-08 20:55 UTC (permalink / raw) To: Chris Wright Cc: Oleg Nesterov, paulmck, Roland McGrath, George Anzinger, akpm, linux-kernel, dipankar, mingo, suzannew On Tue, 8 Nov 2005, Chris Wright wrote: > * Oleg Nesterov (oleg@tv-sign.ru) wrote: > > When non-leader thread does exec, de_thread calls release_task(leader) before > > calling exit_itimers(). If local timer interrupt happens in between, it can > > oops in send_group_sigqueue() while taking ->sighand->siglock == NULL. > > > > However, we can't change send_group_sigqueue() to check p->signal != NULL, > > because sys_timer_create() does get_task_struct() only in SIGEV_THREAD_ID > > case. So it is possible that this task_struct was already freed and we can't > > trust p->signal. > > > > This patch changes de_thread() so that leader released after exit_itimers() > > call. > > Nice catch. As soon as Linus picks it up we'll put it in -stable as > well. Gaah. For some reason I was pretty much the only one not cc'd on the original patch ;) Found it on linux-kernel. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] sigaction should clear all signals on SIG_IGN, not just < 32 2005-11-06 14:32 ` Posix timers vs exec problems Oleg Nesterov 2005-11-07 18:12 ` [PATCH] fix de_thread() vs send_group_sigqueue() race Oleg Nesterov @ 2005-11-16 23:26 ` George Anzinger 2005-11-22 1:09 ` Thread group exec race -> null pointer... HELP George Anzinger 1 sibling, 1 reply; 16+ messages in thread From: George Anzinger @ 2005-11-16 23:26 UTC (permalink / raw) To: Oleg Nesterov Cc: paulmck, Roland McGrath, akpm, linux-kernel, dipankar, mingo, suzannew [-- Attachment #1: Type: text/plain, Size: 488 bytes --] While rooting aroung in the signal code trying to understand how to fix the SIG_IGN ploy (set sig handler to SIG_IGN and flood system with high speed repeating timers) I came across what, I think, is a problem in sigaction() in that when processing a SIG_IGN request it flushes signals from 1 to SIGRTMIN and leaves the rest. The attached patch is an attempt to fix this. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ [-- Attachment #2: sigaction-fix.patch --] [-- Type: text/plain, Size: 3026 bytes --] Source: MontaVista Software, Inc. Type: Defect Fix Description: It appeares that the sigaction system call, when processing a SIG_IGN or a SIG_DFL, is removing only signals that appear in the first mask word. Signed-off-by: George Anzinger <george@mvista.com> include/linux/signal.h | 16 ++++++++++++++++ kernel/signal.c | 34 ++++++++++++++++++++++++++++++++-- 2 files changed, 48 insertions(+), 2 deletions(-) Index: linux-2.6.15-rc/kernel/signal.c =================================================================== --- linux-2.6.15-rc.orig/kernel/signal.c +++ linux-2.6.15-rc/kernel/signal.c @@ -633,6 +633,33 @@ void signal_wake_up(struct task_struct * * Returns 1 if any signals were found. * * All callers must be holding the siglock. + * + * This version takes a sigset mask and looks at all signals, + * not just those in the first mask word. + */ +static int rm_from_queue_full(sigset_t *mask, struct sigpending *s) +{ + struct sigqueue *q, *n; + sigset_t m; + + sigandsets(&m, mask, &s->signal); + if (sigisemptyset(&m)) + return 0; + + signandsets(&s->signal, &s->signal, mask); + list_for_each_entry_safe(q, n, &s->list, list) { + if (sigismember(mask, q->info.si_signo)) { + list_del_init(&q->list); + __sigqueue_free(q); + } + } + return 1; +} +/* + * Remove signals in mask from the pending set and queue. + * Returns 1 if any signals were found. + * + * All callers must be holding the siglock. */ static int rm_from_queue(unsigned long mask, struct sigpending *s) { @@ -2471,6 +2498,7 @@ int do_sigaction(int sig, const struct k_sigaction *act, struct k_sigaction *oact) { struct k_sigaction *k; + sigset_t mask; if (!valid_signal(sig) || sig < 1 || (act && sig_kernel_only(sig))) return -EINVAL; @@ -2518,9 +2546,11 @@ do_sigaction(int sig, const struct k_sig *k = *act; sigdelsetmask(&k->sa.sa_mask, sigmask(SIGKILL) | sigmask(SIGSTOP)); - rm_from_queue(sigmask(sig), &t->signal->shared_pending); + sigemptyset(&mask); + sigaddset(&mask, sig); + rm_from_queue_full(&mask, &t->signal->shared_pending); do { - rm_from_queue(sigmask(sig), &t->pending); + rm_from_queue_full(&mask, &t->pending); recalc_sigpending_tsk(t); t = next_thread(t); } while (t != current); Index: linux-2.6.15-rc/include/linux/signal.h =================================================================== --- linux-2.6.15-rc.orig/include/linux/signal.h +++ linux-2.6.15-rc/include/linux/signal.h @@ -82,6 +82,22 @@ static inline int sigfindinword(unsigned #endif /* __HAVE_ARCH_SIG_BITOPS */ +static inline int sigisemptyset(sigset_t *set) +{ + extern void _NSIG_WORDS_is_unsupported_size(void); + switch (_NSIG_WORDS) { + case 4: + return (set->sig[3] | set->sig[2] | + set->sig[1] | set->sig[0]) == 0; + case 2: + return (set->sig[1] | set->sig[0]) == 0; + case 1: + return set->sig[0] == 0; + default: + _NSIG_WORDS_is_unsupported_size(); + } +} + #define sigmask(sig) (1UL << ((sig) - 1)) #ifndef __HAVE_ARCH_SIG_SETOPS ^ permalink raw reply [flat|nested] 16+ messages in thread
* Thread group exec race -> null pointer... HELP 2005-11-16 23:26 ` [PATCH] sigaction should clear all signals on SIG_IGN, not just < 32 George Anzinger @ 2005-11-22 1:09 ` George Anzinger 2005-11-22 14:45 ` Oleg Nesterov 2005-11-22 19:20 ` [PATCH] fix do_wait() vs exec() race Oleg Nesterov 0 siblings, 2 replies; 16+ messages in thread From: George Anzinger @ 2005-11-22 1:09 UTC (permalink / raw) To: george Cc: Oleg Nesterov, paulmck, Roland McGrath, akpm, linux-kernel, dipankar, mingo, suzannew [-- Attachment #1: Type: text/plain, Size: 2163 bytes --] George Anzinger wrote: > While rooting aroung in the signal code trying to understand how to > fix the SIG_IGN ploy (set sig handler to SIG_IGN and flood system with > high speed repeating timers) I came across what, I think, is a problem > in sigaction() in that when processing a SIG_IGN request it flushes > signals from 1 to SIGRTMIN and leaves the rest. Still rooting around in the above. The test program is attached. It creates and arms a repeating timer and then clones a thread which does an exec() call. If I run the test with top (only two programs running) I quickly get an OOPS on trying to derefence a NULL pointer. It is comming from a call the posix timer code is making to deliver a timer. Call is to send_group_sigqueue() at ~445 in posix-timers.c. The process being passed in is DEAD with current->signal ==NULL, thus the OOPS. In the first instance of this, we see that the thread-group leader is dead and the exec code at line ~718 is setting the old leaders group-leader to him self. The failure then happens when the IRQ release is done on the write_unlock_irq() at ~732 thus allowing the timer interrupt. Thinking that it makes no real sense to set the group leader to a dead process, I did the following: --- linux-2.6.15-rc.orig/fs/exec.c +++ linux-2.6.15-rc/fs/exec.c @@ -715,7 +715,7 @@ static inline int de_thread(struct task_ current->parent = current->real_parent = leader->real_parent; leader->parent = leader->real_parent = child_reaper; current->group_leader = current; - leader->group_leader = leader; + leader->group_leader = current; add_parent(current, current->parent); add_parent(leader, leader->parent); This also fails as there is still a window where the group leader is dead with a null signal pointer, i.e. the interrupt happens (this time on another cpu) before the above changed code is executed. It seems to me that the group leader needs to change prior to setting the signal pointer to NULL, but I don't really know this code very well. Help ! -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ [-- Attachment #2: timer_kill.c --] [-- Type: text/x-csrc, Size: 1193 bytes --] #include <errno.h> #include <stdio.h> #include <signal.h> #include <string.h> #include <stdlib.h> #include <unistd.h> #include <pthread.h> #include <sys/wait.h> #include <time.h> void die(const char* msg) { fprintf(stderr, "ERR!! %s: %s\n", msg, strerror(errno)); exit(-1); } char thread_stack[4096]; int thread_func(void *arg) { execl("/bin/true", NULL); die("exec"); return 0; } void proc_func(void) { int pid; for (;;) if ((pid = fork())) { if (pid != waitpid(pid, NULL, 0)) die("wait4"); } else { struct sigevent sigev = {}; struct itimerspec itsp = {}; timer_t tid; sigev.sigev_signo = SIGRTMIN; sigev.sigev_notify = SIGEV_SIGNAL; if (timer_create(CLOCK_MONOTONIC, &sigev, &tid) == -1) die("timer_create"); itsp.it_value. tv_nsec = 1; itsp.it_interval. tv_nsec = 1; if (timer_settime(tid, 0, &itsp, NULL)) die("timer_settime"); if (clone(thread_func, thread_stack + 2048, CLONE_THREAD|CLONE_SIGHAND|CLONE_VM|CLONE_FILES, NULL) < 0) die("clone"); pause(); } } int main(void) { int pn; signal(SIGRTMIN, SIG_IGN); for (pn = 0; pn < 16; ++pn) if (!fork()) proc_func(); pause(); return 0; } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Thread group exec race -> null pointer... HELP 2005-11-22 1:09 ` Thread group exec race -> null pointer... HELP George Anzinger @ 2005-11-22 14:45 ` Oleg Nesterov 2005-11-23 20:30 ` George Anzinger 2005-11-22 19:20 ` [PATCH] fix do_wait() vs exec() race Oleg Nesterov 1 sibling, 1 reply; 16+ messages in thread From: Oleg Nesterov @ 2005-11-22 14:45 UTC (permalink / raw) To: george Cc: paulmck, Roland McGrath, akpm, linux-kernel, dipankar, mingo, suzannew George Anzinger wrote: > > Still rooting around in the above. The test program is attached. It > creates and arms a repeating timer and then clones a thread which does > an exec() call. This patch: http://marc.theaimsgroup.com/?l=linux-kernel&m=113138286512847 was intended to fix exactly this problem (and the same test program was used to exploit the race and test the fix). So, it does not help? I can't reproduce the problem. Note: I think you also need this patch: http://marc.theaimsgroup.com/?l=linux-kernel&m=113059955626598 otherwise I beleive OOPS can happen while killing this program if you are running the kernel with this change applied: [PATCH] Call exit_itimers from do_exit, not __exit_signal http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=25f407f0b668f5e4ebd5d13e1fb4306ba6427ead > first instance of this, we see that the thread-group leader is dead > and the exec code at line ~718 is setting the old leaders group-leader > to him self. I think this code at line ~718 leader->group_leader = leader; is noop, because leader->group_leader == leader here. > - leader->group_leader = leader; > + leader->group_leader = current; This can't help, without SIGEV_THREAD_ID we don't check ->group_leader, the signal goes to the thread group via timer->it_process, which is equal to the old leader. Oleg. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Thread group exec race -> null pointer... HELP 2005-11-22 14:45 ` Oleg Nesterov @ 2005-11-23 20:30 ` George Anzinger 2005-11-25 15:03 ` Ingo Molnar 0 siblings, 1 reply; 16+ messages in thread From: George Anzinger @ 2005-11-23 20:30 UTC (permalink / raw) To: Oleg Nesterov Cc: paulmck, Roland McGrath, akpm, linux-kernel, dipankar, mingo, suzannew Oleg Nesterov wrote: > George Anzinger wrote: > >>Still rooting around in the above. The test program is attached. It >>creates and arms a repeating timer and then clones a thread which does >>an exec() call. > > > This patch: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=113138286512847 > > was intended to fix exactly this problem (and the same test program was > used to exploit the race and test the fix). > > So, it does not help? I can't reproduce the problem. Yes, it does fix it. Somehow I missed the posting of that patch. > > Note: I think you also need this patch: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=113059955626598 > > otherwise I beleive OOPS can happen while killing this program if you are > running the kernel with this change applied: > > [PATCH] Call exit_itimers from do_exit, not __exit_signal > http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=25f407f0b668f5e4ebd5d13e1fb4306ba6427ead > > > >>first instance of this, we see that the thread-group leader is dead >>and the exec code at line ~718 is setting the old leaders group-leader >>to him self. > > > I think this code at line ~718 > > leader->group_leader = leader; > > is noop, because leader->group_leader == leader here. > > >>- leader->group_leader = leader; >>+ leader->group_leader = current; > > > This can't help, without SIGEV_THREAD_ID we don't check ->group_leader, > the signal goes to the thread group via timer->it_process, which is equal > to the old leader. The signal code returns <0 so posix-timers digs into up the group_leader and trys again. Still, the patch fixes it all. -- George Anzinger george@mvista.com HRT (High-res-timers): http://sourceforge.net/projects/high-res-timers/ ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Thread group exec race -> null pointer... HELP 2005-11-23 20:30 ` George Anzinger @ 2005-11-25 15:03 ` Ingo Molnar 0 siblings, 0 replies; 16+ messages in thread From: Ingo Molnar @ 2005-11-25 15:03 UTC (permalink / raw) To: George Anzinger Cc: Oleg Nesterov, paulmck, Roland McGrath, akpm, linux-kernel, dipankar, suzannew * George Anzinger <george@mvista.com> wrote: > >This patch: > > > > http://marc.theaimsgroup.com/?l=linux-kernel&m=113138286512847 > > > >was intended to fix exactly this problem (and the same test program was > >used to exploit the race and test the fix). > > > >So, it does not help? I can't reproduce the problem. > > Yes, it does fix it. Somehow I missed the posting of that patch. fyi, i've included the patch in -rt15. Ingo ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] fix do_wait() vs exec() race 2005-11-22 1:09 ` Thread group exec race -> null pointer... HELP George Anzinger 2005-11-22 14:45 ` Oleg Nesterov @ 2005-11-22 19:20 ` Oleg Nesterov 1 sibling, 0 replies; 16+ messages in thread From: Oleg Nesterov @ 2005-11-22 19:20 UTC (permalink / raw) To: linux-kernel Cc: george, paulmck, Roland McGrath, akpm, dipankar, mingo, Linus Torvalds, Chris Wright When non-leader thread does exec, de_thread adds old leader to the init's ->children list in EXIT_ZOMBIE state and drops tasklist_lock. This means that release_task(leader) in de_thread() is racy vs do_wait() from init task. I think de_thread() should set old leader's state to EXIT_DEAD instead. Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> --- 2.6.15-rc2/fs/exec.c~ 2005-11-22 19:35:31.000000000 +0300 +++ 2.6.15-rc2/fs/exec.c 2005-11-23 00:49:23.000000000 +0300 @@ -668,7 +668,7 @@ static inline int de_thread(struct task_ if (!thread_group_leader(current)) { struct task_struct *parent; struct dentry *proc_dentry1, *proc_dentry2; - unsigned long exit_state, ptrace; + unsigned long ptrace; /* * Wait for the thread group leader to be a zombie. @@ -726,15 +726,15 @@ static inline int de_thread(struct task_ list_del(¤t->tasks); list_add_tail(¤t->tasks, &init_task.tasks); current->exit_signal = SIGCHLD; - exit_state = leader->exit_state; + + BUG_ON(leader->exit_state != EXIT_ZOMBIE); + leader->exit_state = EXIT_DEAD; write_unlock_irq(&tasklist_lock); spin_unlock(&leader->proc_lock); spin_unlock(¤t->proc_lock); proc_pid_flush(proc_dentry1); proc_pid_flush(proc_dentry2); - - BUG_ON(exit_state != EXIT_ZOMBIE); } /* ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2005-11-25 15:04 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-11-05 1:36 [PATCH] Additional/catchup RCU signal fixes for -mm Paul E. McKenney 2005-11-05 16:32 ` Oleg Nesterov 2005-11-06 1:00 ` Paul E. McKenney 2005-11-06 14:17 ` Oleg Nesterov 2005-11-06 14:46 ` Oleg Nesterov 2005-11-06 23:02 ` Paul E. McKenney 2005-11-06 14:32 ` Posix timers vs exec problems Oleg Nesterov 2005-11-07 18:12 ` [PATCH] fix de_thread() vs send_group_sigqueue() race Oleg Nesterov 2005-11-08 20:36 ` Chris Wright 2005-11-08 20:55 ` Linus Torvalds 2005-11-16 23:26 ` [PATCH] sigaction should clear all signals on SIG_IGN, not just < 32 George Anzinger 2005-11-22 1:09 ` Thread group exec race -> null pointer... HELP George Anzinger 2005-11-22 14:45 ` Oleg Nesterov 2005-11-23 20:30 ` George Anzinger 2005-11-25 15:03 ` Ingo Molnar 2005-11-22 19:20 ` [PATCH] fix do_wait() vs exec() race Oleg Nesterov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox