public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH]CPU hotplug breaks wake_up_new_task
@ 2005-05-31  7:35 Shaohua Li
  2005-05-31  8:00 ` Ashok Raj
  2005-05-31  9:40 ` Srivatsa Vaddagiri
  0 siblings, 2 replies; 11+ messages in thread
From: Shaohua Li @ 2005-05-31  7:35 UTC (permalink / raw)
  To: lkml; +Cc: akpm

Hi,
There is a race condition at wake_up_new_task at CPU hotplug case.
Say do_fork
        copy_process (which sets new forked task's current cpu, cpu_allowed)
                <-------- the new forked task's current cpu is offline
        wake_up_new_task
wake_up_new_task will put the forked task into a dead cpu.

Thanks,
Shaohua

Signed-off-by: Shaohua Li<shaohua.li@intel.com>
---

 linux-2.6.11-rc5-mm1-root/kernel/sched.c |   25 +++++++++++++++++++++++--
 1 files changed, 23 insertions(+), 2 deletions(-)

diff -puN kernel/sched.c~wake_up_new_task_to_online_cpu kernel/sched.c
--- linux-2.6.11-rc5-mm1/kernel/sched.c~wake_up_new_task_to_online_cpu	2005-05-31 13:39:43.888682784 +0800
+++ linux-2.6.11-rc5-mm1-root/kernel/sched.c	2005-05-31 14:49:58.537958704 +0800
@@ -1412,6 +1412,10 @@ void fastcall sched_fork(task_t *p, int 
 	put_cpu();
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static int task_select_online_cpu(int dead_cpu, struct task_struct *tsk);
+#endif
+
 /*
  * wake_up_new_task - wake up a newly created task for the first time.
  *
@@ -1426,9 +1430,20 @@ void fastcall wake_up_new_task(task_t * 
 	runqueue_t *rq, *this_rq;
 
 	rq = task_rq_lock(p, &flags);
-	BUG_ON(p->state != TASK_RUNNING);
 	this_cpu = smp_processor_id();
 	cpu = task_cpu(p);
+#ifdef CONFIG_HOTPLUG_CPU
+	while (!cpu_online(cpu)) {
+		cpu = task_select_online_cpu(cpu, p);
+		set_task_cpu(p, cpu);
+		task_rq_unlock(rq, &flags);
+		/* CPU hotplug might occur here */
+		rq = task_rq_lock(p, &flags);
+		this_cpu = smp_processor_id();
+		cpu = task_cpu(p);
+	}
+#endif
+	BUG_ON(p->state != TASK_RUNNING);
 
 	/*
 	 * We decrease the sleep average of forking parents
@@ -4457,7 +4472,7 @@ wait_to_die:
 
 #ifdef CONFIG_HOTPLUG_CPU
 /* Figure out where task on dead CPU should go, use force if neccessary. */
-static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
+static int task_select_online_cpu(int dead_cpu, struct task_struct *tsk)
 {
 	int dest_cpu;
 	cpumask_t mask;
@@ -4486,6 +4501,12 @@ static void move_task_off_dead_cpu(int d
 			       "longer affine to cpu%d\n",
 			       tsk->pid, tsk->comm, dead_cpu);
 	}
+	return dest_cpu;
+}
+
+static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
+{
+	int dest_cpu = task_select_online_cpu(dead_cpu, tsk);
 	__migrate_task(tsk, dead_cpu, dest_cpu);
 }
 
_



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  7:35 [PATCH]CPU hotplug breaks wake_up_new_task Shaohua Li
@ 2005-05-31  8:00 ` Ashok Raj
  2005-05-31  8:35   ` Shaohua Li
  2005-05-31  9:40 ` Srivatsa Vaddagiri
  1 sibling, 1 reply; 11+ messages in thread
From: Ashok Raj @ 2005-05-31  8:00 UTC (permalink / raw)
  To: Shaohua Li; +Cc: lkml, akpm, nickpiggin

On Tue, May 31, 2005 at 12:35:09AM -0700, Shaohua Li wrote:
> 
>    Hi,
>    There is a race condition at wake_up_new_task at CPU hotplug case.
>    Say do_fork
>             copy_process  (which  sets  new  forked  task's  current cpu,
>    cpu_allowed)
>                    <-------- the new forked task's current cpu is offline
>            wake_up_new_task
>    wake_up_new_task will put the forked task into a dead cpu.
> 
>    Thanks,
>    Shaohua
> 
>    Signed-off-by: Shaohua Li<shaohua.li@intel.com>
>    ---
> 
>     linux-2.6.11-rc5-mm1-root/kernel/sched.c            |              25
.... Deleted....
>     /*
>       *  wake_up_new_task  -  wake  up a newly created task for the first
>    time.
>      *
>    @@ -1426,9 +1430,20 @@ void fastcall wake_up_new_task(task_t *
>            runqueue_t *rq, *this_rq;
> 
>            rq = task_rq_lock(p, &flags);
>    -       BUG_ON(p->state != TASK_RUNNING);
>            this_cpu = smp_processor_id();
>            cpu = task_cpu(p);
>    +#ifdef CONFIG_HOTPLUG_CPU
>    +       while (!cpu_online(cpu)) {
>    +               cpu = task_select_online_cpu(cpu, p);
>    +               set_task_cpu(p, cpu);
>    +               task_rq_unlock(rq, &flags);
>    +               /* CPU hotplug might occur here */
>    +               rq = task_rq_lock(p, &flags);
>    +               this_cpu = smp_processor_id();
>    +               cpu = task_cpu(p);
>    +       }
>    +#endif

The while() loop doesnt look pretty here.. could you try to 
disable preempt, and see the problem goes away? or use 
get_cpu()/put_cpu() combo when you get this_cpu?

Just wondering if the code would be a little more simpler in this
case.

Nick should have something to say ...

Cheers,
ashok

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  8:00 ` Ashok Raj
@ 2005-05-31  8:35   ` Shaohua Li
  2005-05-31  8:50     ` Nick Piggin
  0 siblings, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2005-05-31  8:35 UTC (permalink / raw)
  To: Ashok Raj; +Cc: lkml, akpm, nickpiggin

On Tue, 2005-05-31 at 01:00 -0700, Ashok Raj wrote:
> On Tue, May 31, 2005 at 12:35:09AM -0700, Shaohua Li wrote:
> > 
> >    Hi,
> >    There is a race condition at wake_up_new_task at CPU hotplug case.
> >    Say do_fork
> >             copy_process  (which  sets  new  forked  task's  current cpu,
> >    cpu_allowed)
> >                    <-------- the new forked task's current cpu is offline
> >            wake_up_new_task
> >    wake_up_new_task will put the forked task into a dead cpu.
> > 
> 
> The while() loop doesnt look pretty here.. could you try to 
> disable preempt, and see the problem goes away? or use 
> get_cpu()/put_cpu() combo when you get this_cpu?
> 
> Just wondering if the code would be a little more simpler in this
> case.
I must be over considering. Ok, how does this updated one look?


Signed-off-by: Shaohua Li<shaohua.li@intel.com>
---

 linux-2.6.11-rc5-mm1-root/kernel/sched.c |   23 +++++++++++++++++++++--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff -puN kernel/sched.c~wake_up_new_task_to_online_cpu kernel/sched.c
--- linux-2.6.11-rc5-mm1/kernel/sched.c~wake_up_new_task_to_online_cpu	2005-05-31 13:39:43.888682784 +0800
+++ linux-2.6.11-rc5-mm1-root/kernel/sched.c	2005-05-31 16:14:37.390855704 +0800
@@ -1412,6 +1412,10 @@ void fastcall sched_fork(task_t *p, int 
 	put_cpu();
 }
 
+#ifdef CONFIG_HOTPLUG_CPU
+static int task_select_online_cpu(int dead_cpu, struct task_struct *tsk);
+#endif
+
 /*
  * wake_up_new_task - wake up a newly created task for the first time.
  *
@@ -1425,10 +1429,18 @@ void fastcall wake_up_new_task(task_t * 
 	int this_cpu, cpu;
 	runqueue_t *rq, *this_rq;
 
+	this_cpu = get_cpu();
 	rq = task_rq_lock(p, &flags);
 	BUG_ON(p->state != TASK_RUNNING);
-	this_cpu = smp_processor_id();
 	cpu = task_cpu(p);
+#ifdef CONFIG_HOTPLUG_CPU
+	if (!cpu_online(cpu)) {
+		cpu = task_select_online_cpu(cpu, p);
+		set_task_cpu(p, cpu);
+		task_rq_unlock(rq, &flags);
+		rq = task_rq_lock(p, &flags);
+	}
+#endif
 
 	/*
 	 * We decrease the sleep average of forking parents
@@ -1491,6 +1503,7 @@ void fastcall wake_up_new_task(task_t * 
 	current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
 		PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
 	task_rq_unlock(this_rq, &flags);
+	put_cpu();
 }
 
 /*
@@ -4457,7 +4470,7 @@ wait_to_die:
 
 #ifdef CONFIG_HOTPLUG_CPU
 /* Figure out where task on dead CPU should go, use force if neccessary. */
-static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
+static int task_select_online_cpu(int dead_cpu, struct task_struct *tsk)
 {
 	int dest_cpu;
 	cpumask_t mask;
@@ -4486,6 +4499,12 @@ static void move_task_off_dead_cpu(int d
 			       "longer affine to cpu%d\n",
 			       tsk->pid, tsk->comm, dead_cpu);
 	}
+	return dest_cpu;
+}
+
+static void move_task_off_dead_cpu(int dead_cpu, struct task_struct *tsk)
+{
+	int dest_cpu = task_select_online_cpu(dead_cpu, tsk);
 	__migrate_task(tsk, dead_cpu, dest_cpu);
 }
 
_



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  8:35   ` Shaohua Li
@ 2005-05-31  8:50     ` Nick Piggin
  2005-05-31  9:11       ` Shaohua Li
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Piggin @ 2005-05-31  8:50 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Ashok Raj, lkml, akpm, Ingo Molnar

Shaohua Li wrote:

> I must be over considering. Ok, how does this updated one look?
> 
> 

Looks like you've found a race, alright. Nice work!

I think it would be preferable to do the check in kernel/fork.c,
after the tasklist lock is taken (and you'll need to rediff the
patch for the -mm tree).

One problem with doing it in wake_up_new_task is that I think
some newly forked tasks don't get woken up by wake_up_new_task!

Thanks,
Nick
Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  9:11       ` Shaohua Li
@ 2005-05-31  9:08         ` Nick Piggin
  0 siblings, 0 replies; 11+ messages in thread
From: Nick Piggin @ 2005-05-31  9:08 UTC (permalink / raw)
  To: Shaohua Li; +Cc: Ashok Raj, lkml, akpm, Ingo Molnar

Shaohua Li wrote:
> On Tue, 2005-05-31 at 18:50 +1000, Nick Piggin wrote:
> 
>>Shaohua Li wrote:
>>
>>
>>>I must be over considering. Ok, how does this updated one look?
>>>
>>>
>>
>>Looks like you've found a race, alright. Nice work!
>>
>>I think it would be preferable to do the check in kernel/fork.c,
>>after the tasklist lock is taken (and you'll need to rediff the
>>patch for the -mm tree).
> 
> It seems there is still a race between copy_process and wake_up_new_task
> to me (cpu offline after copy_process). Am I missing anything?
> 

Offlining the CPU takes the tasklist lock to migrate off tasks.
So either the CPU will be offline first, in which case the a
check in kernel/fork.c will pick that up; or the task will be
added to the tasklist first, in which case CPU hotplug should
correctly migrate it away.

I think?

-- 
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  8:50     ` Nick Piggin
@ 2005-05-31  9:11       ` Shaohua Li
  2005-05-31  9:08         ` Nick Piggin
  0 siblings, 1 reply; 11+ messages in thread
From: Shaohua Li @ 2005-05-31  9:11 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Ashok Raj, lkml, akpm, Ingo Molnar

On Tue, 2005-05-31 at 18:50 +1000, Nick Piggin wrote:
> Shaohua Li wrote:
> 
> > I must be over considering. Ok, how does this updated one look?
> > 
> > 
> 
> Looks like you've found a race, alright. Nice work!
> 
> I think it would be preferable to do the check in kernel/fork.c,
> after the tasklist lock is taken (and you'll need to rediff the
> patch for the -mm tree).
It seems there is still a race between copy_process and wake_up_new_task
to me (cpu offline after copy_process). Am I missing anything?

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  7:35 [PATCH]CPU hotplug breaks wake_up_new_task Shaohua Li
  2005-05-31  8:00 ` Ashok Raj
@ 2005-05-31  9:40 ` Srivatsa Vaddagiri
  2005-05-31  9:46   ` Nick Piggin
  1 sibling, 1 reply; 11+ messages in thread
From: Srivatsa Vaddagiri @ 2005-05-31  9:40 UTC (permalink / raw)
  To: Shaohua Li; +Cc: lkml, akpm, Ingo Molnar, Rusty Russell, Nick Piggin, ashok.raj

On Tue, May 31, 2005 at 07:29:39AM +0000, Shaohua Li wrote:
> Hi,
> There is a race condition at wake_up_new_task at CPU hotplug case.
> Say do_fork
>         copy_process (which sets new forked task's current cpu, cpu_allowed)
>                 <-------- the new forked task's current cpu is offline
>         wake_up_new_task
> wake_up_new_task will put the forked task into a dead cpu.

This was noticed/fixed long back. Apparently somebody has reintroduced
the bug. The simple fix for this race is:


--- kernel/fork.c.org	2005-05-31 14:57:15.000000000 +0530
+++ kernel/fork.c	2005-05-31 15:07:20.000000000 +0530
@@ -1024,8 +1024,7 @@ static task_t *copy_process(unsigned lon
 	 * parent's CPU). This avoids alot of nasty races.
 	 */
 	p->cpus_allowed = current->cpus_allowed;
-	if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed)))
-		set_task_cpu(p, smp_processor_id());
+	set_task_cpu(p, smp_processor_id());
 
 	/*
 	 * Check for pending SIGKILL! The new thread should not be allowed

Could you test and check if it avoids whatever problem you are seeing?


-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  9:40 ` Srivatsa Vaddagiri
@ 2005-05-31  9:46   ` Nick Piggin
  2005-05-31 10:40     ` Srivatsa Vaddagiri
  0 siblings, 1 reply; 11+ messages in thread
From: Nick Piggin @ 2005-05-31  9:46 UTC (permalink / raw)
  To: vatsa; +Cc: Shaohua Li, lkml, akpm, Ingo Molnar, Rusty Russell, ashok.raj

Srivatsa Vaddagiri wrote:
> On Tue, May 31, 2005 at 07:29:39AM +0000, Shaohua Li wrote:
> 
>>Hi,
>>There is a race condition at wake_up_new_task at CPU hotplug case.
>>Say do_fork
>>        copy_process (which sets new forked task's current cpu, cpu_allowed)
>>                <-------- the new forked task's current cpu is offline
>>        wake_up_new_task
>>wake_up_new_task will put the forked task into a dead cpu.
> 
> 
> This was noticed/fixed long back. Apparently somebody has reintroduced
> the bug. The simple fix for this race is:
> 

If you're looking at the -mm tree, then I think I must
have reintroduced the bug. It needs a comment.

> 
> --- kernel/fork.c.org	2005-05-31 14:57:15.000000000 +0530
> +++ kernel/fork.c	2005-05-31 15:07:20.000000000 +0530
> @@ -1024,8 +1024,7 @@ static task_t *copy_process(unsigned lon
>  	 * parent's CPU). This avoids alot of nasty races.
>  	 */
>  	p->cpus_allowed = current->cpus_allowed;
> -	if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed)))
> -		set_task_cpu(p, smp_processor_id());
> +	set_task_cpu(p, smp_processor_id());
>  
>  	/*
>  	 * Check for pending SIGKILL! The new thread should not be allowed
> 
> Could you test and check if it avoids whatever problem you are seeing?
> 
> 

And this patch will break balance-on-fork.
How about conditionally setting task_cpu if the task's current
CPU is offline?

Thanks,
Nick

-- 
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31  9:46   ` Nick Piggin
@ 2005-05-31 10:40     ` Srivatsa Vaddagiri
  2005-05-31 10:49       ` Nick Piggin
  0 siblings, 1 reply; 11+ messages in thread
From: Srivatsa Vaddagiri @ 2005-05-31 10:40 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Shaohua Li, lkml, akpm, Ingo Molnar, Rusty Russell, ashok.raj

On Tue, May 31, 2005 at 07:46:13PM +1000, Nick Piggin wrote:
> And this patch will break balance-on-fork.

Right :-)

> How about conditionally setting task_cpu if the task's current
> CPU is offline?

Something like this?

--- kernel/fork.c.org	2005-05-31 14:57:15.000000000 +0530
+++ kernel/fork.c	2005-05-31 16:06:41.000000000 +0530
@@ -1024,7 +1024,8 @@ static task_t *copy_process(unsigned lon
 	 * parent's CPU). This avoids alot of nasty races.
 	 */
 	p->cpus_allowed = current->cpus_allowed;
-	if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed)))
+	if (unlikely(!cpu_isset(task_cpu(p), p->cpus_allowed) ||
+		     !cpu_online(task_cpu(p))))
 		set_task_cpu(p, smp_processor_id());
 
 	/*



-- 


Thanks and Regards,
Srivatsa Vaddagiri,
Linux Technology Center,
IBM Software Labs,
Bangalore, INDIA - 560017

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH]CPU hotplug breaks wake_up_new_task
  2005-05-31 10:40     ` Srivatsa Vaddagiri
@ 2005-05-31 10:49       ` Nick Piggin
  0 siblings, 0 replies; 11+ messages in thread
From: Nick Piggin @ 2005-05-31 10:49 UTC (permalink / raw)
  To: vatsa; +Cc: Shaohua Li, lkml, akpm, Ingo Molnar, Rusty Russell, ashok.raj

Srivatsa Vaddagiri wrote:
> On Tue, May 31, 2005 at 07:46:13PM +1000, Nick Piggin wrote:
> 
>>And this patch will break balance-on-fork.
> 
> 
> Right :-)
> 
> 
>>How about conditionally setting task_cpu if the task's current
>>CPU is offline?
> 
> 
> Something like this?
> 

That's exactly what I had in mind ;)
Shaohua, do you agree?

-- 
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH]CPU hotplug breaks wake_up_new_task
@ 2005-05-31 11:41 Li, Shaohua
  0 siblings, 0 replies; 11+ messages in thread
From: Li, Shaohua @ 2005-05-31 11:41 UTC (permalink / raw)
  To: Nick Piggin, vatsa; +Cc: lkml, akpm, Ingo Molnar, Rusty Russell, Raj, Ashok

>
>Srivatsa Vaddagiri wrote:
>> On Tue, May 31, 2005 at 07:46:13PM +1000, Nick Piggin wrote:
>>
>>>And this patch will break balance-on-fork.
>>
>>
>> Right :-)
>>
>>
>>>How about conditionally setting task_cpu if the task's current
>>>CPU is offline?
>>
>>
>> Something like this?
>>
>
>That's exactly what I had in mind ;)
>Shaohua, do you agree?
That's better. Thanks!

Cheers,
Shaohua

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-05-31 11:41 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-05-31  7:35 [PATCH]CPU hotplug breaks wake_up_new_task Shaohua Li
2005-05-31  8:00 ` Ashok Raj
2005-05-31  8:35   ` Shaohua Li
2005-05-31  8:50     ` Nick Piggin
2005-05-31  9:11       ` Shaohua Li
2005-05-31  9:08         ` Nick Piggin
2005-05-31  9:40 ` Srivatsa Vaddagiri
2005-05-31  9:46   ` Nick Piggin
2005-05-31 10:40     ` Srivatsa Vaddagiri
2005-05-31 10:49       ` Nick Piggin
  -- strict thread matches above, loose matches on Subject: below --
2005-05-31 11:41 Li, Shaohua

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox