* [PATCH] fix race copy_process() vs de_thread()
@ 2009-08-24 4:01 Hiroshi Shimamoto
2009-08-24 5:11 ` KAMEZAWA Hiroyuki
2009-08-24 6:14 ` [PATCH] " Roland McGrath
0 siblings, 2 replies; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-24 4:01 UTC (permalink / raw)
To: Andrew Morton, Oleg Nesterov, Roland McGrath, linux-kernel
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
There is a race between de_thread() and copy_process().
When one thread is in execve and another thread is in clone, the exec-ing
thread may hang in de_thread(), waiting for the other threads to finish.
The root cause is that cleanup_signal(), which is called when fork() fails,
does not wake up the thread waiting in de_thread(), because there is no
check of signal->count on that path.
We need to check signal->group_exit_task and signal->notify_count.
Here is a reproducer; it may generate a thread that never dies.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <pthread.h>

void *null_thread(void *p)
{
	for (;;)
		sleep(1);

	return NULL;
}

void *exec_thread(void *p)
{
	execl("/bin/true", "/bin/true", NULL);

	return null_thread(p);
}

int main(int argc, char **argv)
{
	for (;;) {
		pid_t pid;
		int ret, status;

		pid = fork();
		if (pid < 0)
			break;

		if (!pid) {
			pthread_t tid;

			pthread_create(&tid, NULL, exec_thread, NULL);
			for (;;)
				pthread_create(&tid, NULL, null_thread, NULL);
		}

		do {
			ret = waitpid(pid, &status, 0);
		} while (ret == -1 && errno == EINTR);
	}

	return 0;
}
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
kernel/fork.c | 12 ++++++++++++
1 files changed, 12 insertions(+), 0 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 3ffa10f..be6c5b5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -882,6 +882,9 @@ static void cleanup_signal(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
 
+	if (!sig)
+		return;
+
 	atomic_dec(&sig->live);
 
 	if (atomic_dec_and_test(&sig->count))
@@ -1230,6 +1233,15 @@ static struct task_struct *copy_process(unsigned long clone_flags,
 	 */
 	recalc_sigpending();
 	if (signal_pending(current)) {
+		/* If there is any task waiting for the group exit, notify it */
+		if ((clone_flags & CLONE_THREAD) &&
+		    p->signal->group_exit_task) {
+			atomic_dec(&p->signal->live);
+			atomic_dec(&p->signal->count);
+			if (atomic_read(&p->signal->count) == p->signal->notify_count)
+				wake_up_process(p->signal->group_exit_task);
+			p->signal = NULL;
+		}
 		spin_unlock(&current->sighand->siglock);
 		write_unlock_irq(&tasklist_lock);
 		retval = -ERESTARTNOINTR;
--
1.6.3.3
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 4:01 [PATCH] fix race copy_process() vs de_thread() Hiroshi Shimamoto
@ 2009-08-24 5:11 ` KAMEZAWA Hiroyuki
2009-08-24 5:58 ` [PATCH v2] " Hiroshi Shimamoto
2009-08-24 6:14 ` [PATCH] " Roland McGrath
1 sibling, 1 reply; 13+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-08-24 5:11 UTC (permalink / raw)
To: Hiroshi Shimamoto
Cc: Andrew Morton, Oleg Nesterov, Roland McGrath, linux-kernel
On Mon, 24 Aug 2009 13:01:40 +0900
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> wrote:
> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> ---
> kernel/fork.c | 12 ++++++++++++
> 1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 3ffa10f..be6c5b5 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -882,6 +882,9 @@ static void cleanup_signal(struct task_struct *tsk)
> {
> struct signal_struct *sig = tsk->signal;
>
> + if (!sig)
> + return;
> +
> atomic_dec(&sig->live);
>
> if (atomic_dec_and_test(&sig->count))
> @@ -1230,6 +1233,15 @@ static struct task_struct *copy_process(unsigned long clone_flags,
> */
> recalc_sigpending();
> if (signal_pending(current)) {
> + /* If there is any task waiting for the group exit, notify it */
> + if ((clone_flags & CLONE_THREAD) &&
> + p->signal->group_exit_task) {
> + atomic_dec(&p->signal->live);
> + atomic_dec(&p->signal->count);
> + if (atomic_read(&p->signal->count) == p->signal->notify_count)
> + wake_up_process(p->signal->group_exit_task);
> + p->signal = NULL;
> + }
> +		spin_unlock(&current->sighand->siglock);
> write_unlock_irq(&tasklist_lock);
> retval = -ERESTARTNOINTR;
I'm not sure whether I've caught the issue correctly, but shouldn't we
encapsulate this check in cleanup_signal()?
like...
static void cleanup_signal(struct task_struct *tsk)
{
	struct signal_struct *sig = tsk->signal;
	struct sighand_struct *sighand = rcu_dereference(tsk->sighand);

	spin_lock(&sighand->siglock);
	if (sig->group_exit_task && atomic_read(&sig->count) == sig->notify_count)
		wake_up_process(sig->group_exit_task);
	spin_unlock(&sighand->siglock);

	atomic_dec(&sig->live);
	if (atomic_dec_and_test(&sig->count))
		__cleanup_signal(sig);
}
BTW, why are the sig->xxx members in the "signal" struct guarded by a lock
in the "sighand" struct??
Thanks,
-Kame
* [PATCH v2] fix race copy_process() vs de_thread()
2009-08-24 5:11 ` KAMEZAWA Hiroyuki
@ 2009-08-24 5:58 ` Hiroshi Shimamoto
0 siblings, 0 replies; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-24 5:58 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Andrew Morton, Oleg Nesterov, Roland McGrath, linux-kernel
KAMEZAWA Hiroyuki wrote:
> On Mon, 24 Aug 2009 13:01:40 +0900
> Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> wrote:
>
>> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
>> ---
>> kernel/fork.c | 12 ++++++++++++
>> 1 files changed, 12 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 3ffa10f..be6c5b5 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -882,6 +882,9 @@ static void cleanup_signal(struct task_struct *tsk)
>> {
>> struct signal_struct *sig = tsk->signal;
>>
>> + if (!sig)
>> + return;
>> +
>> atomic_dec(&sig->live);
>>
>> if (atomic_dec_and_test(&sig->count))
>> @@ -1230,6 +1233,15 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>> */
>> recalc_sigpending();
>> if (signal_pending(current)) {
>> + /* If there is any task waiting for the group exit, notify it */
>> + if ((clone_flags & CLONE_THREAD) &&
>> + p->signal->group_exit_task) {
>> + atomic_dec(&p->signal->live);
>> + atomic_dec(&p->signal->count);
>> + if (atomic_read(&p->signal->count) == p->signal->notify_count)
>> + wake_up_process(p->signal->group_exit_task);
>> + p->signal = NULL;
>> + }
>> +		spin_unlock(&current->sighand->siglock);
>> write_unlock_irq(&tasklist_lock);
>> retval = -ERESTARTNOINTR;
>
> I'm not sure whether I catch the issue correctly but, shouldn't we encapsulate
> this check in cleanup_signal() ?
Ah, you're right. copy_process() may fail for several reasons, and any
failure after copy_signal() will cause this issue.
>
> like...
>
> static void cleanup_signal(struct task_struct *tsk)
> {
> struct signal_struct *sig = tsk->signal;
> 	struct sighand_struct *sighand = rcu_dereference(tsk->sighand);
>
> spin_lock(&sighand->siglock);
> if (sig->group_exit_task && atomic_read(&sig->count) == sig->notify_count)
> wake_up_process(sig->group_exit_task);
> spin_unlock(&sighand->siglock);
>
> atomic_dec(&sig->live);
> 	if (atomic_dec_and_test(&sig->count))
> __cleanup_signal(sig);
> }
>
>
> BTW, why sig->xxx members in "signal" struct is guarded by lock in "sighand"
> struct ??
I don't know.
Updated version is below.
Thanks,
Hiroshi
========
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
There is a race between de_thread() and copy_process().
When one thread is in execve and another thread is in clone, the exec-ing
thread may hang in de_thread(), waiting for the other threads to finish.
The root cause is that cleanup_signal(), which is called when fork() fails,
does not wake up the thread waiting in de_thread(), because there is no
check of signal->count on that path.
We need to check signal->group_exit_task and signal->notify_count.
Here is a reproducer; it may generate a thread that never dies.
void *null_thread(void *p)
{
	for (;;)
		sleep(1);

	return NULL;
}

void *exec_thread(void *p)
{
	execl("/bin/true", "/bin/true", NULL);

	return null_thread(p);
}

int main(int argc, char **argv)
{
	for (;;) {
		pid_t pid;
		int ret, status;

		pid = fork();
		if (pid < 0)
			break;

		if (!pid) {
			pthread_t tid;

			pthread_create(&tid, NULL, exec_thread, NULL);
			for (;;)
				pthread_create(&tid, NULL, null_thread, NULL);
		}

		do {
			ret = waitpid(pid, &status, 0);
		} while (ret == -1 && errno == EINTR);
	}

	return 0;
}
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
kernel/fork.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 3ffa10f..bc41c41 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -886,6 +886,14 @@ static void cleanup_signal(struct task_struct *tsk)
 
 	if (atomic_dec_and_test(&sig->count))
 		__cleanup_signal(sig);
+	else {
+		/* If there is any task waiting for the group exit, notify it */
+		spin_lock(&tsk->sighand->siglock);
+		if (sig->group_exit_task &&
+		    atomic_read(&sig->count) == sig->notify_count)
+			wake_up_process(sig->group_exit_task);
+		spin_unlock(&tsk->sighand->siglock);
+	}
 }
 
 static void copy_flags(unsigned long clone_flags, struct task_struct *p)
static void copy_flags(unsigned long clone_flags, struct task_struct *p)
--
1.6.3.3
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 4:01 [PATCH] fix race copy_process() vs de_thread() Hiroshi Shimamoto
2009-08-24 5:11 ` KAMEZAWA Hiroyuki
@ 2009-08-24 6:14 ` Roland McGrath
2009-08-24 6:20 ` Roland McGrath
2009-08-24 6:32 ` Hiroshi Shimamoto
1 sibling, 2 replies; 13+ messages in thread
From: Roland McGrath @ 2009-08-24 6:14 UTC (permalink / raw)
To: Hiroshi Shimamoto; +Cc: Andrew Morton, Oleg Nesterov, linux-kernel
I'm not sure I follow the problem scenario you are citing.
The thread in copy_process() will return -ERESTARTNOINTR
after calling cleanup_signal(), which does properly decrement
sig->count to return it to the state before the copy_process() call.
Then this thread will get to signal handling, dequeue its SIGKILL from
zap_other_threads(), and die itself. When it's finally reaped, by itself
in exit_notify(), or by de_thread() in the case of a replaced group_leader,
the normal __exit_signal() will do that group_exit_task logic.
What part of this sequence fails to occur in your tests?
Thanks,
Roland
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 6:14 ` [PATCH] " Roland McGrath
@ 2009-08-24 6:20 ` Roland McGrath
2009-08-24 6:32 ` Hiroshi Shimamoto
1 sibling, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-08-24 6:20 UTC (permalink / raw)
To: Hiroshi Shimamoto, Andrew Morton, Oleg Nesterov, linux-kernel
Ah, I see. This happens when the copy_process() thread was the old
group_leader itself. Then it never does release_task() itself.
Your fix seems adequate. Another approach would be to have some place
like exit_signals() check for the group_exit_task case.
Oleg, what do you think?
Thanks,
Roland
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 6:14 ` [PATCH] " Roland McGrath
2009-08-24 6:20 ` Roland McGrath
@ 2009-08-24 6:32 ` Hiroshi Shimamoto
2009-08-24 8:38 ` Oleg Nesterov
1 sibling, 1 reply; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-24 6:32 UTC (permalink / raw)
To: Roland McGrath; +Cc: Andrew Morton, Oleg Nesterov, linux-kernel
Roland McGrath wrote:
> I'm not sure I follow the problem scenario you are citing.
>
> The thread in copy_process() will return -ERESTARTNOINTR
> after calling cleanup_signal(), which does properly decrement
> sig->count to return it to the state before the copy_process() call.
> Then this thread will get to signal handling, dequeue its SIGKILL from
> zap_other_threads(), and die itself. When it's finally reaped, by itself
> in exit_notify(), or by de_thread() in the case of a replaced group_leader,
> the normal __exit_signal() will do that group_exit_task logic.
>
> What part of this sequence fails to occur in your tests?
The point is that de_thread() waits until another thread calls wake_up_process().
In __exit_signal(), when sig->count == 2, the exiting thread calls
wake_up_process(), and then de_thread() continues. However, if another thread
is in copy_process(), sig->count has been incremented by copy_signal(), so
wake_up_process() is never called.
The scenario flow is like this:

  thread A            thread B           thread C        sig->count
     |                   |                  |                3
   clone()             exec()               |
   copy_process()      de_thread()          |
     copy_signal()       |                  |                4
     |                  zap                 |                4
    KILL                 |                 KILL              4
     |                   |            __exit_signal()        3   no wake up
   cleanup_signal()      |                                   2   no wake up
     |                   |   (count was raised during copy_process())
   exit                  |
                  never wake up
thanks,
Hiroshi
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 6:32 ` Hiroshi Shimamoto
@ 2009-08-24 8:38 ` Oleg Nesterov
2009-08-24 8:53 ` Oleg Nesterov
2009-08-24 16:27 ` [PATCH v3] " Oleg Nesterov
0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24 8:38 UTC (permalink / raw)
To: Hiroshi Shimamoto; +Cc: Roland McGrath, Andrew Morton, linux-kernel
On 08/24, Hiroshi Shimamoto wrote:
>
> The point is that de_thread() waits until other thread calls wake_up_process().
> In __exit_signal() when sig->count == 2, the thread calls wake_up_process(),
> and then de_thread() will continue. However if another thread is during
> copy_process(), the sig->count is incremented at copy_signal(). That makes
> no wake_up_process().
Yes. Imho signal->count must die. But I never had time to kill it.
It is not needed. For example, __exit_signal() can just check
thread_group_leader() instead of atomic_dec_and_test(sig->count).
As for this bug, I'd like to think a bit more. But how about the
patch below? With this patch
- copy_process() increments signal/live only when we know
we start the new thread
- if copy_process() fails, we just check CLONE_THREAD.
If true - do nothing, the counters were not changed.
If false - just release ->signal, counters must be 1.
Oleg.
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -815,11 +815,8 @@ static int copy_signal(unsigned long clo
 {
 	struct signal_struct *sig;
 
-	if (clone_flags & CLONE_THREAD) {
-		atomic_inc(&current->signal->count);
-		atomic_inc(&current->signal->live);
+	if (clone_flags & CLONE_THREAD)
 		return 0;
-	}
 
 	sig = kmem_cache_alloc(signal_cachep, GFP_KERNEL);
 	tsk->signal = sig;
@@ -877,16 +874,6 @@ void __cleanup_signal(struct signal_stru
 	kmem_cache_free(signal_cachep, sig);
 }
 
-static void cleanup_signal(struct task_struct *tsk)
-{
-	struct signal_struct *sig = tsk->signal;
-
-	atomic_dec(&sig->live);
-
-	if (atomic_dec_and_test(&sig->count))
-		__cleanup_signal(sig);
-}
-
 static void copy_flags(unsigned long clone_flags, struct task_struct *p)
 {
 	unsigned long new_flags = p->flags;
@@ -1239,6 +1226,8 @@ static struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_THREAD) {
+		atomic_inc(&current->signal->count);
+		atomic_inc(&current->signal->live);
 		p->group_leader = current->group_leader;
 		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);
 	}
@@ -1282,7 +1271,8 @@ bad_fork_cleanup_mm:
 	if (p->mm)
 		mmput(p->mm);
 bad_fork_cleanup_signal:
-	cleanup_signal(p);
+	if (!(clone_flags & CLONE_THREAD))
+		__cleanup_signal(p->signal);
 bad_fork_cleanup_sighand:
 	__cleanup_sighand(p->sighand);
 bad_fork_cleanup_fs:
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 8:38 ` Oleg Nesterov
@ 2009-08-24 8:53 ` Oleg Nesterov
2009-08-24 9:15 ` Roland McGrath
2009-08-24 16:27 ` [PATCH v3] " Oleg Nesterov
1 sibling, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24 8:53 UTC (permalink / raw)
To: Hiroshi Shimamoto; +Cc: Roland McGrath, Andrew Morton, linux-kernel
On 08/24, Oleg Nesterov wrote:
>
> On 08/24, Hiroshi Shimamoto wrote:
> >
> > The point is that de_thread() waits until other thread calls wake_up_process().
> > In __exit_signal() when sig->count == 2, the thread calls wake_up_process(),
> > and then de_thread() will continue. However if another thread is during
> > copy_process(), the sig->count is incremented at copy_signal(). That makes
> > no wake_up_process().
>
> Yes. Imho signal->count must die. But I never had time to kill it.
>
> It is not needed. For example, __exit_signal() can just check
> thread_group_leader() instead of atomic_dec_and_test(sig->count).
>
> As for this bug, I'd like to think a bit more. But how about the
> patch below? With this patch
>
> - copy_process() increments signal/live only when we know
> we start the new thread
>
> - if copy_process() fails, we just check CLONE_THREAD.
> If true - do nothing, the counters were not changed.
> If false - just release ->signal, counters must be 1.
Or we can do a bit smaller patch. But in any case, copy_process()
must not use sig->count as a refcounter.
And of course, it would be nice to avoid playing ->notify_count
games in copy_process() paths.
Oleg
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -816,7 +816,6 @@ static int copy_signal(unsigned long clo
 	struct signal_struct *sig;
 
 	if (clone_flags & CLONE_THREAD) {
-		atomic_inc(&current->signal->count);
 		atomic_inc(&current->signal->live);
 		return 0;
 	}
@@ -881,9 +880,7 @@ static void cleanup_signal(struct task_s
 {
 	struct signal_struct *sig = tsk->signal;
 
-	atomic_dec(&sig->live);
-
-	if (atomic_dec_and_test(&sig->count))
+	if (atomic_dec_and_test(&sig->live))
 		__cleanup_signal(sig);
 }
 
@@ -1239,6 +1236,7 @@ static struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_THREAD) {
+		atomic_inc(&current->signal->count);
 		p->group_leader = current->group_leader;
 		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);
 	}
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 8:53 ` Oleg Nesterov
@ 2009-08-24 9:15 ` Roland McGrath
2009-08-24 10:50 ` Oleg Nesterov
0 siblings, 1 reply; 13+ messages in thread
From: Roland McGrath @ 2009-08-24 9:15 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Hiroshi Shimamoto, Andrew Morton, linux-kernel
I think I like your first patch better. It's a larger patch, but that's
because it winds up with smaller code.
Thanks,
Roland
* Re: [PATCH] fix race copy_process() vs de_thread()
2009-08-24 9:15 ` Roland McGrath
@ 2009-08-24 10:50 ` Oleg Nesterov
0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24 10:50 UTC (permalink / raw)
To: Roland McGrath; +Cc: Hiroshi Shimamoto, Andrew Morton, linux-kernel
On 08/24, Roland McGrath wrote:
>
> I think I like your first patch better. It's a larger patch, but that's
> because it winds up with smaller code.
Yes, agreed.
And another reason for the first patch: ->live should not be used as a
refcount; it is not one. The smaller patch would work for copy_process(),
but it looks confusing and relies on CLONE_THREAD subtleties anyway.
I'll re-check this patch and resend. I guess -stable needs this fix too.
Oleg.
* [PATCH v3] fix race copy_process() vs de_thread()
2009-08-24 8:38 ` Oleg Nesterov
2009-08-24 8:53 ` Oleg Nesterov
@ 2009-08-24 16:27 ` Oleg Nesterov
2009-08-24 16:57 ` Roland McGrath
2009-08-25 0:10 ` Hiroshi Shimamoto
1 sibling, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24 16:27 UTC (permalink / raw)
To: Andrew Morton, Hiroshi Shimamoto
Cc: Roland McGrath, linux-kernel, KAMEZAWA Hiroyuki, stable
Spotted by Hiroshi Shimamoto, who also provided the test-case below.

copy_process() paths use signal->count as a reference counter, but
it is not one. This test case
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <pthread.h>

void *null_thread(void *p)
{
	for (;;)
		sleep(1);

	return NULL;
}

void *exec_thread(void *p)
{
	execl("/bin/true", "/bin/true", NULL);

	return null_thread(p);
}

int main(int argc, char **argv)
{
	for (;;) {
		pid_t pid;
		int ret, status;

		pid = fork();
		if (pid < 0)
			break;

		if (!pid) {
			pthread_t tid;

			pthread_create(&tid, NULL, exec_thread, NULL);
			for (;;)
				pthread_create(&tid, NULL, null_thread, NULL);
		}

		do {
			ret = waitpid(pid, &status, 0);
		} while (ret == -1 && errno == EINTR);
	}

	return 0;
}
quickly creates the unkillable task.
If copy_process(CLONE_THREAD) races with de_thread(), the
copy_signal()->atomic_inc(signal->count) breaks the signal->notify_count
logic, and the exec-ing thread can hang forever in kernel space.
Change copy_process() to increment count/live only when we know for
sure we can't fail. In this case the forked thread will take care
of its reference to signal correctly.
If copy_process() fails, check the CLONE_THREAD flag. If it is set, do
nothing: the counters were not changed and current belongs to the same
thread group. If it is not set, ->signal must be released in any case
(and ->count must be == 1); the forked child is the only thread in the
thread group.
We need more cleanups here, in particular signal->count should not be
used by de_thread/__exit_signal at all. This patch only fixes the bug.
Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/fork.c | 20 +++++---------------
1 file changed, 5 insertions(+), 15 deletions(-)
--- PU/kernel/fork.c~FORK_SIG_CNT 2009-08-04 19:48:28.000000000 +0200
+++ PU/kernel/fork.c 2009-08-24 17:39:59.000000000 +0200
@@ -816,11 +816,8 @@ static int copy_signal(unsigned long clo
 {
 	struct signal_struct *sig;
 
-	if (clone_flags & CLONE_THREAD) {
-		atomic_inc(&current->signal->count);
-		atomic_inc(&current->signal->live);
+	if (clone_flags & CLONE_THREAD)
 		return 0;
-	}
 
 	sig = kmem_cache_alloc(signal_cachep, GFP_KERNEL);
 	tsk->signal = sig;
@@ -878,16 +875,6 @@ void __cleanup_signal(struct signal_stru
 	kmem_cache_free(signal_cachep, sig);
 }
 
-static void cleanup_signal(struct task_struct *tsk)
-{
-	struct signal_struct *sig = tsk->signal;
-
-	atomic_dec(&sig->live);
-
-	if (atomic_dec_and_test(&sig->count))
-		__cleanup_signal(sig);
-}
-
 static void copy_flags(unsigned long clone_flags, struct task_struct *p)
 {
 	unsigned long new_flags = p->flags;
@@ -1240,6 +1227,8 @@ static struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_THREAD) {
+		atomic_inc(&current->signal->count);
+		atomic_inc(&current->signal->live);
 		p->group_leader = current->group_leader;
 		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);
 	}
@@ -1282,7 +1271,8 @@ bad_fork_cleanup_mm:
 	if (p->mm)
 		mmput(p->mm);
 bad_fork_cleanup_signal:
-	cleanup_signal(p);
+	if (!(clone_flags & CLONE_THREAD))
+		__cleanup_signal(p->signal);
 bad_fork_cleanup_sighand:
 	__cleanup_sighand(p->sighand);
 bad_fork_cleanup_fs:
* Re: [PATCH v3] fix race copy_process() vs de_thread()
2009-08-24 16:27 ` [PATCH v3] " Oleg Nesterov
@ 2009-08-24 16:57 ` Roland McGrath
2009-08-25 0:10 ` Hiroshi Shimamoto
1 sibling, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-08-24 16:57 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Hiroshi Shimamoto, linux-kernel, KAMEZAWA Hiroyuki,
stable
Acked-by: Roland McGrath <roland@redhat.com>
* Re: [PATCH v3] fix race copy_process() vs de_thread()
2009-08-24 16:27 ` [PATCH v3] " Oleg Nesterov
2009-08-24 16:57 ` Roland McGrath
@ 2009-08-25 0:10 ` Hiroshi Shimamoto
1 sibling, 0 replies; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-25 0:10 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Andrew Morton, Roland McGrath, linux-kernel, KAMEZAWA Hiroyuki,
stable
Oleg Nesterov wrote:
> Spotted by Hiroshi Shimamoto who also provided the test-case below.
>
> copy_process() paths use signal->count as a reference counter, but
> it is not one. This test case
>
> #include <sys/types.h>
> #include <sys/wait.h>
> #include <unistd.h>
> #include <stdio.h>
> #include <errno.h>
> #include <pthread.h>
>
> void *null_thread(void *p)
> {
> 	for (;;)
> 		sleep(1);
> 
> 	return NULL;
> }
> 
> void *exec_thread(void *p)
> {
> 	execl("/bin/true", "/bin/true", NULL);
> 
> 	return null_thread(p);
> }
> 
> int main(int argc, char **argv)
> {
> 	for (;;) {
> 		pid_t pid;
> 		int ret, status;
> 
> 		pid = fork();
> 		if (pid < 0)
> 			break;
> 
> 		if (!pid) {
> 			pthread_t tid;
> 
> 			pthread_create(&tid, NULL, exec_thread, NULL);
> 			for (;;)
> 				pthread_create(&tid, NULL, null_thread, NULL);
> 		}
> 
> 		do {
> 			ret = waitpid(pid, &status, 0);
> 		} while (ret == -1 && errno == EINTR);
> 	}
> 
> 	return 0;
> }
>
> quickly creates the unkillable task.
>
> If copy_process(CLONE_THREAD) races with de_thread()
> copy_signal()->atomic(signal->count) breaks the signal->notify_count
> logic, and the execing thread can hang forever in kernel space.
>
> Change copy_process() to increment count/live only when we know for
> sure we can't fail. In this case the forked thread will take care
> of its reference to signal correctly.
>
> If copy_process() fails, check the CLONE_THREAD flag. If it is set - do
> nothing, the counters were not changed and current belongs to the same
> thread group. If it is not set, ->signal must be released in any case
> (and ->count must be == 1), the forked child is the only thread in the
> thread group.
>
> We need more cleanups here, in particular signal->count should not be
> used by de_thread/__exit_signal at all. This patch only fixes the bug.
>
> Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Nice fix!
Tested-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Thanks,
Hiroshi