public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fix race copy_process() vs de_thread()
@ 2009-08-24  4:01 Hiroshi Shimamoto
  2009-08-24  5:11 ` KAMEZAWA Hiroyuki
  2009-08-24  6:14 ` [PATCH] " Roland McGrath
  0 siblings, 2 replies; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-24  4:01 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Roland McGrath, linux-kernel

From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>

There is a race between de_thread() and copy_process().
When a thread is during execve and another thread is during clone, exec-ing
thread may be hung up in de_thread() waiting other threads are finished.
The root cause is that cleanup_signal() which is called when fork() failed
doesn't cause wake up the waiting thread at de_thread(), because there is no
check signal->count.

We need the check signal->group_exit_task and signal->notify_count.

Here is a reproducer, it may generate a thread which never die.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>
#include <errno.h>
#include <pthread.h>

void *null_thread(void *p)
{
	for (;;)
		sleep(1);

	return NULL;
}

void *exec_thread(void *p)
{
	execl("/bin/true", "/bin/true", NULL);

	return null_thread(p);
}

int main(int argc, char **argv)
{
	for (;;) {
		pid_t pid;
		int ret, status;

		pid = fork();
		if (pid < 0)
			break;

		if (!pid) {
			pthread_t tid;

			pthread_create(&tid, NULL, exec_thread, NULL);
			for (;;)
				pthread_create(&tid, NULL, null_thread, NULL);
		}

		do {
			ret = waitpid(pid, &status, 0);
		} while (ret == -1 && errno == EINTR);
	}

	return 0;
}

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
 kernel/fork.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 3ffa10f..be6c5b5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -882,6 +882,9 @@ static void cleanup_signal(struct task_struct *tsk)
 {
 	struct signal_struct *sig = tsk->signal;
 
+	if (!sig)
+		return;
+
 	atomic_dec(&sig->live);
 
 	if (atomic_dec_and_test(&sig->count))
@@ -1230,6 +1233,15 @@ static struct task_struct *copy_process(unsigned long clone_flags,
  	 */
 	recalc_sigpending();
 	if (signal_pending(current)) {
+		/* If there is any task waiting for the group exit, notify it */
+		if ((clone_flags & CLONE_THREAD) &&
+		    p->signal->group_exit_task) {
+			atomic_dec(&p->signal->live);
+			atomic_dec(&p->signal->count);
+			if (atomic_read(&p->signal->count) == p->signal->notify_count)
+				wake_up_process(p->signal->group_exit_task);
+			p->signal = NULL;
+		}
 		spin_unlock(&current->sighand->siglock);
 		write_unlock_irq(&tasklist_lock);
 		retval = -ERESTARTNOINTR;
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  4:01 [PATCH] fix race copy_process() vs de_thread() Hiroshi Shimamoto
@ 2009-08-24  5:11 ` KAMEZAWA Hiroyuki
  2009-08-24  5:58   ` [PATCH v2] " Hiroshi Shimamoto
  2009-08-24  6:14 ` [PATCH] " Roland McGrath
  1 sibling, 1 reply; 13+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-08-24  5:11 UTC (permalink / raw)
  To: Hiroshi Shimamoto
  Cc: Andrew Morton, Oleg Nesterov, Roland McGrath, linux-kernel

On Mon, 24 Aug 2009 13:01:40 +0900
Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> wrote:
 
> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> ---
>  kernel/fork.c |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 3ffa10f..be6c5b5 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -882,6 +882,9 @@ static void cleanup_signal(struct task_struct *tsk)
>  {
>  	struct signal_struct *sig = tsk->signal;
>  
> +	if (!sig)
> +		return;
> +
>  	atomic_dec(&sig->live);
>  
>  	if (atomic_dec_and_test(&sig->count))
> @@ -1230,6 +1233,15 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>   	 */
>  	recalc_sigpending();
>  	if (signal_pending(current)) {
> +		/* If there is any task waiting for the group exit, notify it */
> +		if ((clone_flags & CLONE_THREAD) &&
> +		    p->signal->group_exit_task) {
> +			atomic_dec(&p->signal->live);
> +			atomic_dec(&p->signal->count);
> +			if (atomic_read(&p->signal->count) == p->signal->notify_count)
> +				wake_up_process(p->signal->group_exit_task);
> +			p->signal = NULL;
> +		}
>  		spin_unlock(&current->sighand->siglock);
>  		write_unlock_irq(&tasklist_lock);
>  		retval = -ERESTARTNOINTR;

I'm not sure whehter I catch the issue correctly but, shouldn't we encapsulate
this check in cleanup_signal() ?

like...

static void cleanup_signal(struct task_struct *tsk)
{
        struct signal_struct *sig = tsk->signal;
	sighand = rcu_dereference(tsk->sighand);
 
        spin_lock(&sighand->siglock);
        if (sig->group_exit_task && atomic_read(&sig->count) == sig->notify_count)
 		wake_up_process(sig->group_exit_task);
	spin_unlock(&sighand->siglock);

	atomic_dec(&sig->live);
	if (!atomic_dec_and_test(&sig->count))
		 __cleanup_signal(sig);
}


BTW, why sig->xxx members in "signal" struct is guarded by lock in "sighand"
struct ??

Thanks,
-Kame






^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2] fix race copy_process() vs de_thread()
  2009-08-24  5:11 ` KAMEZAWA Hiroyuki
@ 2009-08-24  5:58   ` Hiroshi Shimamoto
  0 siblings, 0 replies; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-24  5:58 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Andrew Morton, Oleg Nesterov, Roland McGrath, linux-kernel

KAMEZAWA Hiroyuki wrote:
> On Mon, 24 Aug 2009 13:01:40 +0900
> Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> wrote:
>  
>> Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
>> ---
>>  kernel/fork.c |   12 ++++++++++++
>>  1 files changed, 12 insertions(+), 0 deletions(-)
>>
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 3ffa10f..be6c5b5 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -882,6 +882,9 @@ static void cleanup_signal(struct task_struct *tsk)
>>  {
>>  	struct signal_struct *sig = tsk->signal;
>>  
>> +	if (!sig)
>> +		return;
>> +
>>  	atomic_dec(&sig->live);
>>  
>>  	if (atomic_dec_and_test(&sig->count))
>> @@ -1230,6 +1233,15 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>>   	 */
>>  	recalc_sigpending();
>>  	if (signal_pending(current)) {
>> +		/* If there is any task waiting for the group exit, notify it */
>> +		if ((clone_flags & CLONE_THREAD) &&
>> +		    p->signal->group_exit_task) {
>> +			atomic_dec(&p->signal->live);
>> +			atomic_dec(&p->signal->count);
>> +			if (atomic_read(&p->signal->count) == p->signal->notify_count)
>> +				wake_up_process(p->signal->group_exit_task);
>> +			p->signal = NULL;
>> +		}
>>  		spin_unlock(&current->sighand->siglock);
>>  		write_unlock_irq(&tasklist_lock);
>>  		retval = -ERESTARTNOINTR;
> 
> I'm not sure whehter I catch the issue correctly but, shouldn't we encapsulate
> this check in cleanup_signal() ?

Ah, you're right. clone_process() may fail several reasons and failures after
copy_signal() will cause this issue.

> 
> like...
> 
> static void cleanup_signal(struct task_struct *tsk)
> {
>         struct signal_struct *sig = tsk->signal;
> 	sighand = rcu_dereference(tsk->sighand);
>  
>         spin_lock(&sighand->siglock);
>         if (sig->group_exit_task && atomic_read(&sig->count) == sig->notify_count)
>  		wake_up_process(sig->group_exit_task);
> 	spin_unlock(&sighand->siglock);
> 
> 	atomic_dec(&sig->live);
> 	if (!atomic_dec_and_test(&sig->count))
> 		 __cleanup_signal(sig);
> }
> 
> 
> BTW, why sig->xxx members in "signal" struct is guarded by lock in "sighand"
> struct ??

I don't know.

Update version is below.

Thanks,
Hiroshi

========
From: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>

There is a race between de_thread() and copy_process().
When a thread is during execve and another thread is during clone, a exec-ing
thread may be hung up in de_thread() waiting other threads are finished.
The root cause is that cleanup_signal() which is called when fork() failed
doesn't cause wake up the waiting thread at de_thread(), because there is no
check signal->count.

We need the check signal->group_exit_task and signal->notify_count.

Here is a reproducer, it may generate a thread which never die.

void *null_thread(void *p)
{
	for (;;)
		sleep(1);

	return NULL;
}

void *exec_thread(void *p)
{
	execl("/bin/true", "/bin/true", NULL);

	return null_thread(p);
}

int main(int argc, char **argv)
{
	for (;;) {
		pid_t pid;
		int ret, status;

		pid = fork();
		if (pid < 0)
			break;

		if (!pid) {
			pthread_t tid;

			pthread_create(&tid, NULL, exec_thread, NULL);
			for (;;)
				pthread_create(&tid, NULL, null_thread, NULL);
		}

		do {
			ret = waitpid(pid, &status, 0);
		} while (ret == -1 && errno == EINTR);
	}

	return 0;
}

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
---
 kernel/fork.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 3ffa10f..bc41c41 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -886,6 +886,14 @@ static void cleanup_signal(struct task_struct *tsk)
 
 	if (atomic_dec_and_test(&sig->count))
 		__cleanup_signal(sig);
+	else {
+		/* If there is any task waiting for the group exit, notify it */
+		spin_lock(&tsk->sighand->siglock);
+		if (sig->group_exit_task &&
+		    atomic_read(&sig->count) == sig->notify_count)
+			wake_up_process(sig->group_exit_task);
+		spin_unlock(&tsk->sighand->siglock);
+	}
 }
 
 static void copy_flags(unsigned long clone_flags, struct task_struct *p)
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  4:01 [PATCH] fix race copy_process() vs de_thread() Hiroshi Shimamoto
  2009-08-24  5:11 ` KAMEZAWA Hiroyuki
@ 2009-08-24  6:14 ` Roland McGrath
  2009-08-24  6:20   ` Roland McGrath
  2009-08-24  6:32   ` Hiroshi Shimamoto
  1 sibling, 2 replies; 13+ messages in thread
From: Roland McGrath @ 2009-08-24  6:14 UTC (permalink / raw)
  To: Hiroshi Shimamoto; +Cc: Andrew Morton, Oleg Nesterov, linux-kernel

I'm not sure I follow the problem scenario you are citing.

The thread in copy_process() will return -ERESTARTNOINTR
after calling cleanup_signal(), which does properly decrement
sig->count to return it to the state before the copy_process() call.
Then this thread will get to signal handling, dequeue its SIGKILL from
zap_other_threads(), and die itself.  When it's finally reaped, by itself
in exit_notify(), or by de_thread() in the case of a replaced group_leader,
the normal __exit_signal() will do that group_exit_task logic.

What part of this sequence fails to occur in your tests?


Thanks,
Roland

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  6:14 ` [PATCH] " Roland McGrath
@ 2009-08-24  6:20   ` Roland McGrath
  2009-08-24  6:32   ` Hiroshi Shimamoto
  1 sibling, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-08-24  6:20 UTC (permalink / raw)
  To: Hiroshi Shimamoto, Andrew Morton, Oleg Nesterov, linux-kernel

Ah, I see.  This happens when the copy_process() thread was the old
group_leader itself.  Then it never does release_task() itself.

Your fix seems adequate.  Another approach would be to have some place
like exit_signals() check for the group_exit_task case.

Oleg, what do you think?


Thanks,
Roland

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  6:14 ` [PATCH] " Roland McGrath
  2009-08-24  6:20   ` Roland McGrath
@ 2009-08-24  6:32   ` Hiroshi Shimamoto
  2009-08-24  8:38     ` Oleg Nesterov
  1 sibling, 1 reply; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-24  6:32 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Andrew Morton, Oleg Nesterov, linux-kernel

Roland McGrath wrote:
> I'm not sure I follow the problem scenario you are citing.
> 
> The thread in copy_process() will return -ERESTARTNOINTR
> after calling cleanup_signal(), which does properly decrement
> sig->count to return it to the state before the copy_process() call.
> Then this thread will get to signal handling, dequeue its SIGKILL from
> zap_other_threads(), and die itself.  When it's finally reaped, by itself
> in exit_notify(), or by de_thread() in the case of a replaced group_leader,
> the normal __exit_signal() will do that group_exit_task logic.
> 
> What part of this sequence fails to occur in your tests?

The point is that de_thread() waits until other thread calls wake_up_process().
In __exit_signal() when sig->count == 2, the thread calls wake_up_process(),
and then de_thread() will continue. However if another thread is during
copy_process(), the sig->count is incremented at copy_signal(). That makes
no wake_up_process().

The scenario flow is like this;

thread A          thread B    thread C      sig->count
   |                 |           |              3
  clone()          exec()        |              |
  copy_process()   de_thread()   |              |
  copy_signal()      |           |              4
   |                zap          |              4
  KILL               |          KILL            4
   |                 |          __exit_signal() 3 no wake up
  clean_signal()     |                          2 no wake up
   |                 |                             during copy_process()
  exit               |
                never wake up

thanks,
Hiroshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  6:32   ` Hiroshi Shimamoto
@ 2009-08-24  8:38     ` Oleg Nesterov
  2009-08-24  8:53       ` Oleg Nesterov
  2009-08-24 16:27       ` [PATCH v3] " Oleg Nesterov
  0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24  8:38 UTC (permalink / raw)
  To: Hiroshi Shimamoto; +Cc: Roland McGrath, Andrew Morton, linux-kernel

On 08/24, Hiroshi Shimamoto wrote:
>
> The point is that de_thread() waits until other thread calls wake_up_process().
> In __exit_signal() when sig->count == 2, the thread calls wake_up_process(),
> and then de_thread() will continue. However if another thread is during
> copy_process(), the sig->count is incremented at copy_signal(). That makes
> no wake_up_process().

Yes. Imho signal->count must die. But I never had time to kill it.

It is not needed. For example, __exit_signal() can just check
thread_group_leader() instead of atomic_dec_and_test(sig->count).

As for this bug, I'd like to think a bit more. But how about the
patch below? With this patch

	- copy_process() increments signal/live only when we know
	  we start the new thread

	- if copy_process() fails, we just check CLONE_THREAD.
	  If true - do nothing, the counters were not changed.
	  If false - just release ->signal, counters must be 1.

Oleg.

--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -815,11 +815,8 @@ static int copy_signal(unsigned long clo
 {
 	struct signal_struct *sig;
 
-	if (clone_flags & CLONE_THREAD) {
-		atomic_inc(&current->signal->count);
-		atomic_inc(&current->signal->live);
+	if (clone_flags & CLONE_THREAD)
 		return 0;
-	}
 
 	sig = kmem_cache_alloc(signal_cachep, GFP_KERNEL);
 	tsk->signal = sig;
@@ -877,16 +874,6 @@ void __cleanup_signal(struct signal_stru
 	kmem_cache_free(signal_cachep, sig);
 }
 
-static void cleanup_signal(struct task_struct *tsk)
-{
-	struct signal_struct *sig = tsk->signal;
-
-	atomic_dec(&sig->live);
-
-	if (atomic_dec_and_test(&sig->count))
-		__cleanup_signal(sig);
-}
-
 static void copy_flags(unsigned long clone_flags, struct task_struct *p)
 {
 	unsigned long new_flags = p->flags;
@@ -1239,6 +1226,8 @@ static struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_THREAD) {
+		atomic_inc(&current->signal->count);
+		atomic_inc(&current->signal->live);
 		p->group_leader = current->group_leader;
 		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);
 	}
@@ -1282,7 +1271,8 @@ bad_fork_cleanup_mm:
 	if (p->mm)
 		mmput(p->mm);
 bad_fork_cleanup_signal:
-	cleanup_signal(p);
+	if (!(clone_flags & CLONE_THREAD))
+		__cleanup_signal(p);
 bad_fork_cleanup_sighand:
 	__cleanup_sighand(p->sighand);
 bad_fork_cleanup_fs:


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  8:38     ` Oleg Nesterov
@ 2009-08-24  8:53       ` Oleg Nesterov
  2009-08-24  9:15         ` Roland McGrath
  2009-08-24 16:27       ` [PATCH v3] " Oleg Nesterov
  1 sibling, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24  8:53 UTC (permalink / raw)
  To: Hiroshi Shimamoto; +Cc: Roland McGrath, Andrew Morton, linux-kernel

On 08/24, Oleg Nesterov wrote:
>
> On 08/24, Hiroshi Shimamoto wrote:
> >
> > The point is that de_thread() waits until other thread calls wake_up_process().
> > In __exit_signal() when sig->count == 2, the thread calls wake_up_process(),
> > and then de_thread() will continue. However if another thread is during
> > copy_process(), the sig->count is incremented at copy_signal(). That makes
> > no wake_up_process().
>
> Yes. Imho signal->count must die. But I never had time to kill it.
>
> It is not needed. For example, __exit_signal() can just check
> thread_group_leader() instead of atomic_dec_and_test(sig->count).
>
> As for this bug, I'd like to think a bit more. But how about the
> patch below? With this patch
>
> 	- copy_process() increments signal/live only when we know
> 	  we start the new thread
>
> 	- if copy_process() fails, we just check CLONE_THREAD.
> 	  If true - do nothing, the counters were not changed.
> 	  If false - just release ->signal, counters must be 1.

Or we can do a bit smaller patch. But in any case, copy_process()
must not use sig->count as a refcounter.

And of course, it would be nice to avoid playing ->notify_count
games in copy_process() pathes.

Oleg


--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -816,7 +816,6 @@ static int copy_signal(unsigned long clo
 	struct signal_struct *sig;
 
 	if (clone_flags & CLONE_THREAD) {
-		atomic_inc(&current->signal->count);
 		atomic_inc(&current->signal->live);
 		return 0;
 	}
@@ -881,9 +880,7 @@ static void cleanup_signal(struct task_s
 {
 	struct signal_struct *sig = tsk->signal;
 
-	atomic_dec(&sig->live);
-
-	if (atomic_dec_and_test(&sig->count))
+	if (atomic_dec_and_test(&sig->live))
 		__cleanup_signal(sig);
 }
 
@@ -1239,6 +1236,7 @@ static struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_THREAD) {
+		atomic_inc(&current->signal->count);
 		p->group_leader = current->group_leader;
 		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);
 	}


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  8:53       ` Oleg Nesterov
@ 2009-08-24  9:15         ` Roland McGrath
  2009-08-24 10:50           ` Oleg Nesterov
  0 siblings, 1 reply; 13+ messages in thread
From: Roland McGrath @ 2009-08-24  9:15 UTC (permalink / raw)
  To: Oleg Nesterov; +Cc: Hiroshi Shimamoto, Andrew Morton, linux-kernel

I think I like your first patch better.  It's a larger patch, but that's
because it winds up with smaller code.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] fix race copy_process() vs de_thread()
  2009-08-24  9:15         ` Roland McGrath
@ 2009-08-24 10:50           ` Oleg Nesterov
  0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24 10:50 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Hiroshi Shimamoto, Andrew Morton, linux-kernel

On 08/24, Roland McGrath wrote:
>
> I think I like your first patch better.  It's a larger patch, but that's
> because it winds up with smaller code.

Yes, agreed.

And another reason for the first patch: ->live should not be used as a refcnt,
it is not. This should work for copy_process() but looks confusing and relies
on CLONE_THREAD subtleness anyway.

I'll re-check this patch and resend. I guess -stable needs this fix too.

Oleg.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3] fix race copy_process() vs de_thread()
  2009-08-24  8:38     ` Oleg Nesterov
  2009-08-24  8:53       ` Oleg Nesterov
@ 2009-08-24 16:27       ` Oleg Nesterov
  2009-08-24 16:57         ` Roland McGrath
  2009-08-25  0:10         ` Hiroshi Shimamoto
  1 sibling, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2009-08-24 16:27 UTC (permalink / raw)
  To: Andrew Morton, Hiroshi Shimamoto
  Cc: Roland McGrath, linux-kernel, KAMEZAWA Hiroyuki, stable

Spotted by Hiroshi Shimamoto who also provided the test-case below.

copy_process() pathes use signal->count as a reference counter, but
it is not. This test case

	#include <sys/types.h>
	#include <sys/wait.h>
	#include <unistd.h>
	#include <stdio.h>
	#include <errno.h>
	#include <pthread.h>

	void *null_thread(void *p)
	{
		for (;;)
			sleep(1);

		return NULL;
	}

	void *exec_thread(void *p)
	{
		execl("/bin/true", "/bin/true", NULL);

		return null_thread(p);
	}

	int main(int argc, char **argv)
	{
		for (;;) {
			pid_t pid;
			int ret, status;

			pid = fork();
			if (pid < 0)
				break;

			if (!pid) {
				pthread_t tid;

				pthread_create(&tid, NULL, exec_thread, NULL);
				for (;;)
					pthread_create(&tid, NULL, null_thread, NULL);
			}

			do {
				ret = waitpid(pid, &status, 0);
			} while (ret == -1 && errno == EINTR);
		}

		return 0;
	}

quickly creates the unkillable task.

If copy_process(CLONE_THREAD) races with de_thread()
copy_signal()->atomic(signal->count) breaks the signal->notify_count
logic, and the execing thread can hang forever in kernel space.

Change copy_process() to increment count/live only when we know for
sure we can't fail. In this case the forked thread will take care
of its reference to signal correctly.

If copy_process() fails, check CLONE_THREAD flag. If it it set - do
nothing, the counters were not changed and current belongs to the same
thread group. If it is not set, ->signal must be released in any case
(and ->count must be == 1), the forked child is the only thread in the
thread group.

We need more cleanups here, in particular signal->count should not be
used by de_thread/__exit_signal at all. This patch only fixes the bug.

Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/fork.c |   20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

--- PU/kernel/fork.c~FORK_SIG_CNT	2009-08-04 19:48:28.000000000 +0200
+++ PU/kernel/fork.c	2009-08-24 17:39:59.000000000 +0200
@@ -816,11 +816,8 @@ static int copy_signal(unsigned long clo
 {
 	struct signal_struct *sig;
 
-	if (clone_flags & CLONE_THREAD) {
-		atomic_inc(&current->signal->count);
-		atomic_inc(&current->signal->live);
+	if (clone_flags & CLONE_THREAD)
 		return 0;
-	}
 
 	sig = kmem_cache_alloc(signal_cachep, GFP_KERNEL);
 	tsk->signal = sig;
@@ -878,16 +875,6 @@ void __cleanup_signal(struct signal_stru
 	kmem_cache_free(signal_cachep, sig);
 }
 
-static void cleanup_signal(struct task_struct *tsk)
-{
-	struct signal_struct *sig = tsk->signal;
-
-	atomic_dec(&sig->live);
-
-	if (atomic_dec_and_test(&sig->count))
-		__cleanup_signal(sig);
-}
-
 static void copy_flags(unsigned long clone_flags, struct task_struct *p)
 {
 	unsigned long new_flags = p->flags;
@@ -1240,6 +1227,8 @@ static struct task_struct *copy_process(
 	}
 
 	if (clone_flags & CLONE_THREAD) {
+		atomic_inc(&current->signal->count);
+		atomic_inc(&current->signal->live);
 		p->group_leader = current->group_leader;
 		list_add_tail_rcu(&p->thread_group, &p->group_leader->thread_group);
 	}
@@ -1282,7 +1271,8 @@ bad_fork_cleanup_mm:
 	if (p->mm)
 		mmput(p->mm);
 bad_fork_cleanup_signal:
-	cleanup_signal(p);
+	if (!(clone_flags & CLONE_THREAD))
+		__cleanup_signal(p->signal);
 bad_fork_cleanup_sighand:
 	__cleanup_sighand(p->sighand);
 bad_fork_cleanup_fs:


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] fix race copy_process() vs de_thread()
  2009-08-24 16:27       ` [PATCH v3] " Oleg Nesterov
@ 2009-08-24 16:57         ` Roland McGrath
  2009-08-25  0:10         ` Hiroshi Shimamoto
  1 sibling, 0 replies; 13+ messages in thread
From: Roland McGrath @ 2009-08-24 16:57 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Hiroshi Shimamoto, linux-kernel, KAMEZAWA Hiroyuki,
	stable

Acked-by: Roland McGrath <roland@redhat.com>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3] fix race copy_process() vs de_thread()
  2009-08-24 16:27       ` [PATCH v3] " Oleg Nesterov
  2009-08-24 16:57         ` Roland McGrath
@ 2009-08-25  0:10         ` Hiroshi Shimamoto
  1 sibling, 0 replies; 13+ messages in thread
From: Hiroshi Shimamoto @ 2009-08-25  0:10 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Roland McGrath, linux-kernel, KAMEZAWA Hiroyuki,
	stable

Oleg Nesterov wrote:
> Spotted by Hiroshi Shimamoto who also provided the test-case below.
> 
> copy_process() pathes use signal->count as a reference counter, but
> it is not. This test case
> 
> 	#include <sys/types.h>
> 	#include <sys/wait.h>
> 	#include <unistd.h>
> 	#include <stdio.h>
> 	#include <errno.h>
> 	#include <pthread.h>
> 
> 	void *null_thread(void *p)
> 	{
> 		for (;;)
> 			sleep(1);
> 
> 		return NULL;
> 	}
> 
> 	void *exec_thread(void *p)
> 	{
> 		execl("/bin/true", "/bin/true", NULL);
> 
> 		return null_thread(p);
> 	}
> 
> 	int main(int argc, char **argv)
> 	{
> 		for (;;) {
> 			pid_t pid;
> 			int ret, status;
> 
> 			pid = fork();
> 			if (pid < 0)
> 				break;
> 
> 			if (!pid) {
> 				pthread_t tid;
> 
> 				pthread_create(&tid, NULL, exec_thread, NULL);
> 				for (;;)
> 					pthread_create(&tid, NULL, null_thread, NULL);
> 			}
> 
> 			do {
> 				ret = waitpid(pid, &status, 0);
> 			} while (ret == -1 && errno == EINTR);
> 		}
> 
> 		return 0;
> 	}
> 
> quickly creates the unkillable task.
> 
> If copy_process(CLONE_THREAD) races with de_thread()
> copy_signal()->atomic(signal->count) breaks the signal->notify_count
> logic, and the execing thread can hang forever in kernel space.
> 
> Change copy_process() to increment count/live only when we know for
> sure we can't fail. In this case the forked thread will take care
> of its reference to signal correctly.
> 
> If copy_process() fails, check CLONE_THREAD flag. If it it set - do
> nothing, the counters were not changed and current belongs to the same
> thread group. If it is not set, ->signal must be released in any case
> (and ->count must be == 1), the forked child is the only thread in the
> thread group.
> 
> We need more cleanups here, in particular signal->count should not be
> used by de_thread/__exit_signal at all. This patch only fixes the bug.
> 
> Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Nice fix!
Tested-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>

Thanks,
Hiroshi

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-08-25  0:20 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-08-24  4:01 [PATCH] fix race copy_process() vs de_thread() Hiroshi Shimamoto
2009-08-24  5:11 ` KAMEZAWA Hiroyuki
2009-08-24  5:58   ` [PATCH v2] " Hiroshi Shimamoto
2009-08-24  6:14 ` [PATCH] " Roland McGrath
2009-08-24  6:20   ` Roland McGrath
2009-08-24  6:32   ` Hiroshi Shimamoto
2009-08-24  8:38     ` Oleg Nesterov
2009-08-24  8:53       ` Oleg Nesterov
2009-08-24  9:15         ` Roland McGrath
2009-08-24 10:50           ` Oleg Nesterov
2009-08-24 16:27       ` [PATCH v3] " Oleg Nesterov
2009-08-24 16:57         ` Roland McGrath
2009-08-25  0:10         ` Hiroshi Shimamoto

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox